INHERITED AND SOMATIC MUTATIONS OF

APC GENE IN COLORECTAL CANCER OF HUMANS 

The U.S. Government has a paid-up license in this invention and 
the right in limited circumstances to require the patent owner to 
license others on reasonable terms as provided for by the terms of 
grants awarded by the National Institutes of Health.

TECHNICAL AREA OF THE INVENTION

The invention relates to the area of cancer diagnostics and ther- 
apeutics. More particularly, the invention relates to detection of the 
germline and somatic alterations of wild-type APC genes. In addition, 
it relates to therapeutic intervention to restore the function of APC 
gene product. 

BACKGROUND OF THE INVENTION 

According to the model of Knudson for tumorigenesis (Cancer 
Research, Vol. 45, p. 1482, 1985), there are tumor suppressor genes in 
all normal cells which, when they become non-functional due to muta- 
tion, cause neoplastic development. Evidence for this model has been 
found in the cases of retinoblastoma and colorectal tumors. The impli- 
cated suppressor genes in those tumors, RB, p53, DCC and MCC, were 
found to be deleted or altered in many cases of the tumors studied. 
(Hansen and Cavenee, Cancer Research, Vol. 47, pp. 5518-5527 (1987);
Baker et al., Science, Vol. 244, p. 217 (1989); Fearon et al., Science,
Vol. 247, p. 49 (1990); Kinzler et al., Science, Vol. 251, p. 1366 (1991).)

In order to fully understand the pathogenesis of tumors, it will 
be necessary to identify the other suppressor genes that play a role in 
the tumorigenesis process. Prominent among these is the one(s) pre-
sumptively located at 5q21. Cytogenetic (Herrera et al., Am. J. Med.
Genet., Vol. 25, p. 473 (1986)) and linkage (Leppert et al., Science, Vol.
238, p. 1411 (1987); Bodmer et al., Nature, Vol. 328, p. 614 (1987)) stud-
ies have shown that this chromosome region harbors the gene
responsible for familial adenomatous polyposis (FAP) and Gardner's
Syndrome (GS). FAP is an autosomal-dominant, inherited disease in 
which affected individuals develop hundreds to thousands of 
adenomatous polyps, some of which progress to malignancy. GS is a 
variant of FAP in which desmoid tumors, osteomas and other soft tissue 
tumors occur together with multiple adenomas of the colon and rec- 
tum. A less severe form of polyposis has been identified in which only 
a few (2-40) polyps develop. This condition also is familial and is linked 
to the same chromosomal markers as FAP and GS (Leppert et al., New
England Journal of Medicine, Vol. 322, pp. 904-908, 1990). Additionally,
this chromosomal region is often deleted from the adenomas
(Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988)) and carcino-
mas (Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988); Solomon
et al., Nature, Vol. 328, p. 616 (1987); Sasaki et al., Cancer Research,
Vol. 49, p. 4402 (1989); Delattre et al., Lancet, Vol. 2, p. 353 (1989); and
Ashton-Rickardt et al., Oncogene, Vol. 4, p. 1169 (1989)) of patients
without FAP (sporadic tumors). Thus, a putative suppressor gene on
chromosome 5q21 appears to play a role in the early stages of
colorectal neoplasia in both sporadic and familial tumors.

Although the MCC gene has been identified on 5q21 as a candi-
date suppressor gene, it does not appear to be altered in FAP or GS
patients. Thus there is a need in the art for investigations of this chro-
mosomal region to identify genes and to determine if any of such genes
are associated with FAP and/or GS and the process of tumorigenesis. 

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a method for 
diagnosing and prognosing a neoplastic tissue of a human.

It is another object of the invention to provide a method of 
detecting genetic predisposition to cancer. 

It is another object of the invention to provide a method of sup- 
plying wild-type APC gene function to a cell which has lost said gene 
function. 

It is yet another object of the invention to provide a kit for 
determination of the nucleotide sequence of APC alleles by the 
polymerase chain reaction.

It is still another object of the invention to provide nucleic acid
probes for detection of mutations in the human APC gene. 

It is still another object of the invention to provide a cDNA mol-
ecule encoding the APC gene product.

It is yet another object of the invention to provide a preparation 
of the human APC protein. 

It is another object of the invention to provide a method of 
screening for genetic predisposition to cancer. 

It is an object of the invention to provide methods of testing 
therapeutic agents for the ability to suppress neoplasia. 

It is still another object of the invention to provide animals car- 
rying mutant APC alleles. 

These and other objects of the invention are provided by one or 
more of the embodiments which are described below. In one embodi- 
ment of the present invention a method of diagnosing or prognosing a 
neoplastic tissue of a human is provided comprising: detecting somatic 
alteration of wild-type APC genes or their expression products in a 
sporadic colorectal cancer tissue, said alteration indicating neoplasia of
the tissue.

In yet another embodiment a method is provided of detecting 
genetic predisposition to cancer in a human including familial 
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), comprising: 
isolating a human sample selected from the group consisting of blood 
and fetal tissue; detecting alteration of wild-type APC gene coding 
sequences or their expression products from the sample, said alteration 
indicating genetic predisposition to cancer. 

In another embodiment of the present invention a method is 
provided for supplying wild-type APC gene function to a cell which has 
lost said gene function by virtue of a mutation in the APC gene, com- 
prising: introducing a wild-type APC gene into a cell which has lost 
said gene function such that said wild-type gene is expressed in the 
cell. 

In another embodiment a method of supplying wild-type APC 
gene function to a cell is provided comprising: introducing a portion of 
a wild-type APC gene into a cell which has lost said gene function such 
that said portion is expressed in the cell, said portion encoding a part
of the APC protein which is required for non-neoplastic growth of said 
cell. APC protein can also be applied to cells or administered to ani- 
mals to remediate for mutant APC genes. Synthetic peptides or drugs 
can also be used to mimic APC function in cells which have altered 
APC expression. 

In yet another embodiment a pair of single stranded primers is 
provided for determination of the nucleotide sequence of the APC gene 
by polymerase chain reaction. The sequence of said pair of single
stranded DNA primers is derived from chromosome 5q band 21, said 
pair of primers allowing synthesis of APC gene coding sequences. 

In still another embodiment of the invention a nucleic acid probe 
is provided which is complementary to human wild-type APC gene cod- 
ing sequences and which can form mismatches with mutant APC genes, 
thereby allowing their detection by enzymatic or chemical cleavage or
by shifts in electrophoretic mobility. 

In another embodiment of the invention a method is provided for 
detecting the presence of a neoplastic tissue in a human. The method 
comprises isolating a body sample from a human; detecting in said sam- 
ple alteration of a wild-type APC gene sequence or wild-type APC 
expression product, said alteration indicating the presence of a 
neoplastic tissue in the human. 

In still another embodiment a cDNA molecule is provided which 
comprises the coding sequence of the APC gene. 

In even another embodiment a preparation of the human APC 
protein is provided which is substantially free of other human proteins. 
The amino acid sequence of the protein is shown in Figure 3 or 7. 

In yet another embodiment of the invention a method is provided 
for screening for genetic predisposition to cancer, including familial
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), in a human. 
The method comprises: detecting among kindred persons the presence 
of a DNA polymorphism which is linked to a mutant APC allele in an 
individual having a genetic predisposition to cancer, said kindred being 
genetically related to the individual, the presence of said polymorphism 
suggesting a predisposition to cancer.

In another embodiment of the invention a method of testing
therapeutic agents for the ability to suppress a neoplastically trans- 
formed phenotype is provided. The method comprises: applying a test 
substance to a cultured epithelial cell which carries a mutation in an 
APC allele; and determining whether said test substance suppresses 
the neoplastically transformed phenotype of the cell.

In another embodiment of the invention a method of testing 
therapeutic agents for the ability to suppress a neoplastically trans- 
formed phenotype is provided. The method comprises: administering a 
test substance to an animal which carries a mutant APC allele; and 
determining whether said test substance prevents or suppresses the 
growth of tumors. 

In still other embodiments of the invention transgenic animals
are provided. The animals carry a mutant APC allele from a second 
animal species or have been genetically engineered to contain an inser- 
tion mutation which disrupts an APC allele. 

The present invention provides the art with the information that 
the APC gene, a heretofore unknown gene, is in fact a target of muta-
tional alterations on chromosome 5q21 and that these alterations are
associated with the process of tumorigenesis. This information allows 
highly specific assays to be performed to assess the neoplastic status of 
a particular tissue or the predisposition to cancer of an individual. This 
invention has applicability to Familial Adenomatous Polyposis, sporadic 
colorectal cancers, Gardner's Syndrome, as well as the less severe 
familial polyposis discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A shows an overview of yeast artificial chromosome 
(YAC) contigs. Genetic distances between selected RFLP markers
from within the contigs are shown in centiMorgans.

Figure 1B shows a detailed map of the three central contigs.
The position of the six identified genes from within the FAP region is
shown: the 5' and 3' ends of the transcripts from these genes have in
general not yet been isolated, as indicated by the string of dots sur-
rounding the bars denoting the genes' positions. Selected restriction
endonuclease recognition sites are indicated. B, BssH2; S, SstII;
M, MluI; N, NruI.

Figure 2 shows the sequence of TB1 and TB2 genes. The cDNA
sequence of the TB1 gene was determined from the analysis of 11
cDNA clones derived from normal colon and liver, as described in the
text. A total of 2314 bp were contained within the overlapping cDNA
clones, defining an ORF of 424 amino acids beginning at nucleotide 1.
Only the predicted amino acids from the ORF are shown. The
carboxy terminal end of the ORF has apparently been identified, but
the 5' end of the TB1 transcript has not yet been precisely determined.

The cDNA sequence of the TB2 gene was determined from the 
YS-39 clone derived as described in the text. This clone consisted of 
2300 bp and defined an ORF of 185 amino acids beginning at nucleotide 
1. Only the predicted amino acids are shown. The carboxy terminal 
end of the ORF has apparently been identified, but the 5' end of the 
TB2 transcript has not been precisely determined. 

Figure 3 shows the sequence of the APC gene product. The 
cDNA sequence was determined through the analysis of 87 cDNA clones 
derived from normal colon, liver, and brain. A total of 8973 bp were 
contained within overlapping cDNA clones, defining an ORF of 2842 
amino acids. In-frame stop codons surrounded this ORF, as described in
the text, suggesting that the entire APC gene product was represented 
in the ORF illustrated. Only the predicted amino acids are shown. 

Figure 4 shows the local similarity between human APC and ral2 
of yeast. Local similarity among the APC and MCC genes and the m3 
muscarinic acetylcholine receptor is shown. The region of the mAChR 
shown corresponds to that responsible for coupling the receptor to G 
proteins. The connecting lines indicate identities; dots indicate related 
amino acid residues.

Figure 5 shows the genomic map of the 1200 kb NotI fragment at 
the FAP locus. The NotI fragment is shown as a bold line. Relevant 
parts of the deletion chromosomes from patients 3214 and 3824 are 
shown as stippled lines. Probes used to characterize the NotI fragment 
and the deletions, and three YACs from which subclones were obtained, 
are shown below the restriction map. The chimeric end of YAC
183H12 is indicated by a dotted line. The orientation and approximate
position of MCC are indicated above the map. 

Figure 6 shows the DNA sequence and predicted amino acid 
sequence of DP1 (TB2). The nucleotide numbering begins at the most 5'
nucleotide isolated. A proposed initiation methionine (base 77) is indi- 
cated in bold type. The entire coding sequence is presented. 

Figure 7 shows the cDNA and predicted amino acid sequence of
DP2.5 (APC). The nucleotide numbering begins at the proposed initia-
tion methionine. The nucleotides and amino acids of the alternatively
spliced exon (exon 9; nucleotide positions 934-1236) are presented in
lower case letters. At the 3' end, a poly(A) addition signal occurs at
9530, and one cDNA clone has a poly(A) at 9563. Other cDNA clones 
extend beyond 9563, however, and their consensus sequence is included 
here. 

Figure 8 shows the arrangement of exons in DP2.5 (APC). 
(A) Exon 9 corresponds to nucleotides 933-1312; exon 9a corresponds to 
nucleotides 1236-1312. The stop codon in the cDNA is at nucleotide 
8535. (B) Partial intronic sequence surrounding each exon is shown.

DETAILED DESCRIPTION

It is a discovery of the present invention that mutational events 
associated with tumorigenesis occur in a previously unknown gene on 
chromosome 5q named here the APC (Adenomatous Polyposis Coli) 
gene. Although it was previously known that deletions of alleles on
chromosome 5q were common in certain types of cancers, it was not 
known that a target gene of these deletions was the APC gene. Fur- 
ther it was not known that other types of mutational events in the APC 
gene are also associated with cancers. The mutations of the APC gene 
can involve gross rearrangements, such as insertions and deletions. 
Point mutations have also been observed. 

According to the diagnostic and prognostic method of the 
present invention, alteration of the wild-type APC gene is detected. 
"Alteration of a wild-type gene" according to the present invention 
encompasses all forms of mutations, including deletions. The alter-
ation may be due either to rearrangements such as insertions, inver-
sions, and deletions, or to point mutations. Deletions may be of the
entire gene or only a portion of the gene. Somatic mutations are those
which occur only in certain tissues, e.g., in the tumor tissue, and are
not inherited in the germline. Germline mutations can be found in any
of a body's tissues. If only a single allele is somatically mutated, an
early neoplastic state is indicated. However, if both alleles are 
mutated then a late neoplastic state is indicated. The finding of APC 
mutations thus provides both diagnostic and prognostic information. 
An APC allele which is not deleted (e.g., that on the sister chromosome 
to a chromosome carrying an APC deletion) can be screened for other 
mutations, such as insertions, small deletions, and point mutations. It 
is believed that many mutations found in tumor tissues will be those 
leading to decreased expression of the APC gene product. However, 
mutations leading to non-functional gene products would also lead to a 
cancerous state. Point mutational events may occur in regulatory 
regions, such as in the promoter of the gene, leading to loss or diminu-
tion of expression of the mRNA. Point mutations may also abolish
proper RNA processing, leading to loss of expression of the APC gene 
product. 

In order to detect the alteration of the wild-type APC gene in a 
tissue, it is helpful to isolate the tissue free from surrounding normal 
tissues. Means for enriching a tissue preparation for tumor cells are 
known in the art. For example, the tissue may be isolated from paraf-
fin or cryostat sections. Cancer cells may also be separated from nor-
mal cells by flow cytometry. These as well as other techniques for
separating tumor from normal cells are well known in the art. If the 
tumor tissue is highly contaminated with normal cells, detection of 
mutations is more difficult. 

Detection of point mutations may be accomplished by molecular 
cloning of the APC allele (or alleles) and sequencing that allele(s) using
techniques well known in the art. Alternatively, the polymerase chain
reaction (PCR) can be used to amplify gene sequences directly from a
genomic DNA preparation from the tumor tissue. The DNA sequence
of the amplified sequences can then be determined. The polymerase
chain reaction itself is well known in the art. See, e.g., Saiki et al.,
Science, Vol. 239, p. 487, 1988; U.S. 4,683,203; and U.S. 4,683,195. 
Specific primers which can be used in order to amplify the gene will
be discussed in more detail below. The ligase chain reaction, which is
known in the art, can also be used to amplify APC sequences. See Wu
et al., Genomics, Vol. 4, pp. 560-569 (1989). In addition, a technique
known as allele specific PCR can be used. (See Ruano and Kidd,
Nucleic Acids Research, Vol. 17, p. 8392, 1989.) According to this
technique, primers are used which hybridize at their 3' ends to a par-
ticular APC mutation. If the particular APC mutation is not present,
an amplification product is not observed. Amplification Refractory 
Mutation System (ARMS) can also be used as disclosed in European 
Patent Application Publication No. 0332435 and in Newton et al., 
Nucleic Acids Research, Vol. 17, p. 7, 1989. Insertions and deletions of
genes can also be detected by cloning, sequencing and amplification. In 
addition, restriction fragment length polymorphism (RFLP) probes for 
the gene or surrounding marker genes can be used to score alteration 
of an allele or an insertion in a polymorphic fragment. Such a method 
is particularly useful for screening among kindred persons of an
affected individual for the presence of the APC mutation found in that 
individual. Single stranded conformation polymorphism (SSCP) analysis 
can also be used to detect base change variants of an allele. (Orita et 
al., Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 2766-2770, 1989, and
Genomics, Vol. 5, pp. 874-879, 1989.) Other techniques for detecting
insertions and deletions as are known in the art can be used. 
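
The allele specific PCR principle described above (a primer whose 3' end hybridizes to a particular mutation, so that no product is observed when the mutation is absent) can be sketched as follows. This is a simplified illustration only: the sequences are short hypothetical strings, not APC sequences, the function name `is_extended` and the `max_body_mismatches` parameter are assumptions introduced for the example, and real primer extension depends on reaction conditions that the model ignores.

```python
def is_extended(template: str, primer: str, start: int,
                max_body_mismatches: int = 1) -> bool:
    """Predict whether a primer placed at `start` on `template` is extended.

    Simplified model (an assumption, not a measured rule): the 3'-terminal
    base of the primer must match the template exactly, while a small number
    of mismatches is tolerated in the rest of the primer.
    Both strings are written 5'->3' on the same strand for simplicity.
    """
    site = template[start:start + len(primer)]
    if len(site) != len(primer):
        return False  # primer runs off the end of the template
    if site[-1] != primer[-1]:
        return False  # 3' mismatch: no amplification product expected
    body_mismatches = sum(a != b for a, b in zip(site[:-1], primer[:-1]))
    return body_mismatches <= max_body_mismatches


# Hypothetical wild-type and mutant templates differing by one base (G -> A).
wild_type = "ATGGCTCAGGTACCGT"
mutant = "ATGGCTCAGATACCGT"

# Primer whose 3'-terminal base is the mutant base.
mutant_primer = "GGCTCAGA"

print(is_extended(mutant, mutant_primer, 2))     # product expected
print(is_extended(wild_type, mutant_primer, 2))  # no product expected
```

In this toy model the mutant-specific primer yields a product only on the mutant template, which mirrors the presence/absence readout described in the text.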

Alteration of wild-type genes can also be detected on the basis 
of the alteration of a wild-type expression product of the gene. Such 
expression products include both the APC mRNA as well as the APC
protein product. The sequences of these products are shown in 
Figures 3 and 7. Point mutations may be detected by amplifying and 
sequencing the mRNA or via molecular cloning of cDNA made from the 
mRNA. The sequence of the cloned cDNA can be determined using 
DNA sequencing techniques which are well known in the art. The
cDNA can also be sequenced via the polymerase chain reaction (PCR) 
which will be discussed in more detail below. 

Mismatches, according to the present invention, are hybridized
nucleic acid duplexes which are not 100% homologous. The lack of
total homology may be due to deletions, insertions, inversions, substitu-
tions or frameshift mutations. Mismatch detection can be used to
detect point mutations in the gene or its mRNA product. While these 
techniques are less sensitive than sequencing, they are simpler to per-
form on a large number of tumor samples. An example of a mismatch
cleavage technique is the RNase protection method, which is described 
in detail in Winter et al., Proc. Natl. Acad. Sci. USA, Vol. 82, p. 7575,
1985 and Myers et al., Science, Vol. 230, p. 1242, 1985. In the practice
of the present invention the method involves the use of a labeled
riboprobe which is complementary to the human wild-type APC gene 
coding sequence. The riboprobe and either mRNA or DNA isolated 
from the tumor tissue are annealed (hybridized) together and subse- 
quently digested with the enzyme RNase A which is able to detect 
some mismatches in a duplex RNA structure. If a mismatch is detected 
by RNase A, it cleaves at the site of the mismatch. Thus, when the 
annealed RNA preparation is separated on an electrophoretic gel 
matrix, if a mismatch has been detected and cleaved by RNase A, an 
RNA product will be seen which is smaller than the full-length duplex 
RNA for the riboprobe and the mRNA or DNA. The riboprobe need not 
be the full length of the APC mRNA or gene but can be a segment of 
either. If the riboprobe comprises only a segment of the APC mRNA or 
gene it will be desirable to use a number of these probes to screen the 
whole mRNA sequence for mismatches. 
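
The readout of the RNase protection method described above can be illustrated with a small sketch that predicts the sizes of the protected fragments. It assumes, purely for illustration, that every mismatch is recognized and cleaved (the text notes that RNase A detects only some mismatches), that the probe region and the sample are the same length, and that both are written in the same orientation. The function name `protected_fragments` and the sequences are hypothetical.

```python
def protected_fragments(reference: str, sample: str) -> list[int]:
    """Return the lengths of fragments left after cleavage at each mismatch.

    `reference` is the wild-type sequence covered by the riboprobe and
    `sample` is the corresponding sequence from the tissue under test.
    Idealized assumption: cleavage occurs at every mismatched position.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must cover the same region")
    fragments = []
    start = 0
    for i, (ref_base, sample_base) in enumerate(zip(reference, sample)):
        if ref_base != sample_base:
            if i > start:
                fragments.append(i - start)
            start = i + 1  # the mismatched base itself is not protected
    if len(reference) > start:
        fragments.append(len(reference) - start)
    return fragments


# Hypothetical 16-base region; the variant differs at one position.
region = "AUGGCUCAGGUACCGU"
variant = "AUGGCUCAGAUACCGU"

print(protected_fragments(region, region))   # [16]: full-length protection
print(protected_fragments(region, variant))  # [9, 6]: two smaller fragments
```

A single full-length fragment corresponds to a fully matched duplex, while smaller fragments on the gel indicate a mismatch within the probed segment.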

In similar fashion, DNA probes can be used to detect mis-
matches, through enzymatic or chemical cleavage. See, e.g., Cotton et
al., Proc. Natl. Acad. Sci. USA, Vol. 85, p. 4397, 1988; and Shenk et al.,
Proc. Natl. Acad. Sci. USA, Vol. 72, p. 989, 1975. Alternatively, mis-
matches can be detected by shifts in the electrophoretic mobility of
mismatched duplexes relative to matched duplexes. See, e.g., Cariello,
Human Genetics, Vol. 42, p. 726, 1988. With either riboprobes or DNA
probes, the cellular mRNA or DNA which might contain a mutation can
be amplified using PCR (see below) before hybridization. Changes in
DNA of the APC gene can also be detected using Southern hybridiza-
tion, especially if the changes are gross rearrangements, such as dele-
tions and insertions.

DNA sequences of the APC gene which have been amplified by
use of polymerase chain reaction may also be screened using allele-spe- 
cific probes. These probes are nucleic acid oligomers, each of which 
contains a region of the APC gene sequence harboring a known muta-
tion. For example, one oligomer may be about 38 nucleotides in length,
corresponding to a portion of the APC gene sequence. By use of a bat- 
tery of such allele-specific probes, PCR amplification products can be 
screened to identify the presence of a previously identified mutation in 
the APC gene. Hybridization of allele-specific probes with amplified 
APC sequences can be performed, for example, on a nylon filter. 
Hybridization to a particular probe under stringent hybridization condi- 
tions indicates the presence of the same mutation in the tumor tissue 
as in the allele-specific probe. 
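
Screening an amplification product against a battery of allele-specific probes, as described above, can be sketched as a lookup over probe sequences. The sketch models stringent hybridization as an exact match between probe and amplified sequence, which is a deliberate simplification; it also writes probes on the same strand as the amplicon rather than as complements. The probe names, the sequences, and the function `screen_with_probes` are hypothetical and are not taken from the APC sequence.

```python
def screen_with_probes(amplicon: str, probes: dict[str, str]) -> list[str]:
    """Return the names of probes that 'hybridize' to the amplicon.

    Stringent hybridization is approximated by requiring the full probe
    sequence to occur exactly within the amplified sequence.
    """
    return [name for name, seq in probes.items() if seq in amplicon]


# Hypothetical battery: one wild-type probe and one probe carrying a
# previously identified (toy) point mutation.
probe_battery = {
    "wild_type": "CTCAGGTACC",
    "toy_mutation_G_to_A": "CTCAGATACC",
}

normal_amplicon = "ATGGCTCAGGTACCGT"
tumor_amplicon = "ATGGCTCAGATACCGT"

print(screen_with_probes(normal_amplicon, probe_battery))
print(screen_with_probes(tumor_amplicon, probe_battery))
```

Hybridization to the mutation-bearing probe but not to the wild-type probe would indicate that the sample carries the same mutation as that probe.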

Alteration of APC mRNA expression can be detected by any 
technique known in the art. These include Northern blot analysis, PCR 
amplification and RNase protection. Diminished mRNA expression 
indicates an alteration of the wild-type APC gene. 

Alteration of wild-type APC genes can also be detected by 
screening for alteration of wild-type APC protein. For example, 
monoclonal antibodies immunoreactive with APC can be used to screen 
a tissue. Lack of cognate antigen would indicate an APC mutation. 
Antibodies specific for products of mutant alleles could also be used to 
detect mutant APC gene product. Such immunological assays can be 
done in any convenient format known in the art. These include West- 
ern blots, immunohistochemical assays and ELISA assays. Any means 
for detecting an altered APC protein can be used to detect alteration 
of wild-type APC genes. Functional assays can be used, such as protein 
binding determinations. For example, it is believed that APC protein 
oligomerizes to itself and/or MCC protein or binds to a G protein. 
Thus, an assay for the ability to bind to wild type APC or MCC protein 
or that G protein can be employed. In addition, assays can be used
which detect APC biochemical function. It is believed that APC is 
involved in phospholipid metabolism. Thus, assaying the enzymatic 
products of the involved phospholipid metabolic pathway can be used to 
determine APC activity. Finding a mutant APC gene product indicates
alteration of a wild-type APC gene. 

Mutant APC genes or gene products can also be detected in 
other human body samples, such as, serum, stool, urine and sputum. 
The same techniques discussed above for detection of mutant APC 
genes or gene products in tissues can be applied to other body samples. 
Cancer cells are sloughed off from tumors and appear in such body 
samples. In addition, the APC gene product itself may be secreted into
the extracellular space and found in these body samples even in the 
absence of cancer cells. By screening such body samples, a simple 
early diagnosis can be achieved for many types of cancers. In addition, 
the progress of chemotherapy or radiotherapy can be monitored more 
easily by testing such body samples for mutant APC genes or gene 
products. 

The methods of diagnosis of the present invention are applicable 
to any tumor in which APC has a role in tumorigenesis. Deletions of
chromosome arm 5q have been observed in tumors of lung, breast, 
colon, rectum, bladder, liver, sarcomas, stomach and prostate, as well 
as in leukemias and lymphomas. Thus these are likely to be tumors in 
which APC has a role. The diagnostic method of the present invention 
is useful for clinicians so that they can decide upon an appropriate
course of treatment. For example, a tumor displaying alteration of
both APC alleles might suggest a more aggressive therapeutic regimen
than a tumor displaying alteration of only one APC allele.

The primer pairs of the present invention are useful for determi- 
nation of the nucleotide sequence of a particular APC allele using the 
polymerase chain reaction. The pairs of single stranded DNA primers
can be annealed to sequences within or surrounding the APC gene on
chromosome 5q in order to prime amplifying DNA synthesis of the APC
gene itself. A complete set of these primers allows synthesis of all of
the nucleotides of the APC gene coding sequences, i.e., the exons. The
set of primers preferably allows synthesis of both intron and exon 
sequences. Allele specific primers can also be used. Such primers 
anneal only to particular APC mutant alleles, and thus will only amplify
a product in the presence of the mutant allele as a template.

In order to facilitate subsequent cloning of amplified sequences,
primers may have restriction enzyme site sequences appended to their
5' ends. Thus, all nucleotides of the primers are derived from APC
sequences or sequences adjacent to APC except the few nucleotides 
necessary to form a restriction enzyme site. Such enzymes and sites 
are well known in the art. The primers themselves can be synthesized 
using techniques which are well known in the art. Generally, the prim- 
ers can be made using oligonucleotide synthesizing machines which are 
commercially available. Given the sequence of the APC open reading 
frame shown in Figure 7, design of particular primers is well within the 
skill of the art. 
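
Appending a restriction enzyme site to the 5' end of a primer, as described above, amounts to prefixing the gene-derived primer sequence with the recognition sequence. The following sketch shows this with the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences; the choice of these two enzymes, the function name `add_restriction_site`, and the primer sequence are assumptions made for illustration and are not specified in the text.

```python
# Recognition sequences for two commonly used restriction enzymes.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}


def add_restriction_site(primer: str, enzyme: str) -> str:
    """Return the primer (5'->3') with the enzyme's site appended at its 5' end.

    Raises ValueError if the site already occurs inside the gene-derived
    part of the primer, since that could complicate later cloning steps.
    """
    site = RESTRICTION_SITES[enzyme]
    if site in primer:
        raise ValueError(f"{enzyme} site already present in primer")
    return site + primer


# Hypothetical gene-derived primer sequence.
gene_primer = "GGCTCAGGTACC"
print(add_restriction_site(gene_primer, "EcoRI"))  # GAATTCGGCTCAGGTACC
```

Only the six prefixed nucleotides differ from the gene-derived sequence, consistent with the statement that all other primer nucleotides come from APC or adjacent sequences.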

The nucleic acid probes provided by the present invention are 
useful for a number of purposes. They can be used in Southern hybrid-
ization to genomic DNA and in the RNase protection method for
detecting point mutations already discussed above. The probes can be
used to detect PCR amplification products. They may also be used to
detect mismatches with the APC gene or mRNA using other tech-
niques. Mismatches can be detected using either enzymes (e.g., S1
nuclease), chemicals (e.g., hydroxylamine or osmium tetroxide and
piperidine), or changes in electrophoretic mobility of mismatched
hybrids as compared to totally matched hybrids. These techniques are
known in the art. See Cotton, supra; Shenk, supra; Myers, supra; Win-
ter, supra; and Novack et al., Proc. Natl. Acad. Sci. USA, Vol. 83, p.
586, 1986. Generally, the probes are complementary to APC gene cod-
ing sequences, although probes to certain introns are also contem-
plated. An entire battery of nucleic acid probes is used to compose a
kit for detecting alteration of wild-type APC genes. The kit allows for 
hybridization to the entire APC gene. The probes may overlap with 
each other or be contiguous. 
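
A battery of probes that together cover an entire sequence, either contiguously or with overlap, can be laid out with a simple tiling routine. The sketch below is only an illustration of that layout: the function name `tile_probes`, the probe length, and the overlap are hypothetical parameters, and the input is a toy sequence rather than the APC coding sequence.

```python
def tile_probes(sequence: str, probe_len: int,
                overlap: int = 0) -> list[tuple[int, str]]:
    """Split `sequence` into probes of `probe_len` bases covering every position.

    Returns (start_index, probe_sequence) pairs. With overlap=0 the probes
    are contiguous; with overlap>0 each probe shares `overlap` bases with
    its predecessor. The final probe may be shorter than `probe_len`.
    """
    if probe_len <= 0 or not 0 <= overlap < probe_len:
        raise ValueError("require probe_len > 0 and 0 <= overlap < probe_len")
    step = probe_len - overlap
    probes = []
    start = 0
    while start < len(sequence):
        probes.append((start, sequence[start:start + probe_len]))
        if start + probe_len >= len(sequence):
            break  # the end of the sequence has been covered
        start += step
    return probes


toy_sequence = "ACGTACGTACGTACGTACGT"  # 20 bases, hypothetical

contiguous = tile_probes(toy_sequence, probe_len=8)
overlapping = tile_probes(toy_sequence, probe_len=8, overlap=2)

print([start for start, _ in contiguous])   # [0, 8, 16]
print([start for start, _ in overlapping])  # [0, 6, 12]
```

Contiguous tiling uses the fewest probes, while overlapping tiling avoids leaving a mutation exactly at a probe boundary; which is preferable is a design choice the text leaves open.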

If a riboprobe is used to detect mismatches with mRNA, it is 
complementary to the mRNA of the human wild-type APC gene. The 
riboprobe thus is an anti-sense probe in that it does not code for the 
APC protein because it is of the opposite polarity to the sense strand. 
The riboprobe generally will be labeled with a radioactive, 
colorimetric, or fluorometric material, which can be accomplished by 
any means known in the art. If the riboprobe is used to detect mis-
matches with DNA it can be of either polarity, sense or anti-sense.
Similarly, DNA probes also may be used to detect mismatches.
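
The statement that a riboprobe for detecting mismatches with mRNA is of opposite polarity to the sense strand can be made concrete: its sequence is the reverse complement of the mRNA segment it targets. The following minimal sketch assumes a plain A/C/G/U alphabet and uses a hypothetical six-base segment; the function name `antisense_riboprobe` is introduced only for this example.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_riboprobe(mrna_segment: str) -> str:
    """Return the anti-sense RNA (5'->3') complementary to an mRNA segment.

    The segment is read 5'->3'; the probe is its reverse complement, so it
    can base-pair with the mRNA but does not itself code for the protein.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna_segment))


segment = "AUGGCU"  # hypothetical mRNA segment
probe = antisense_riboprobe(segment)
print(probe)                       # AGCCAU
print(antisense_riboprobe(probe))  # AUGGCU (sense sequence recovered)
```

Applying the function twice returns the original segment, which is a quick check that the probe and the mRNA are exact complements.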

Nucleic acid probes may also be complementary to mutant 
alleles of the APC gene. These are useful to detect similar mutations 
in other patients on the basis of hybridization rather than mismatches. 
These are discussed above and referred to as allele-specific probes. As
mentioned above, the APC probes can also be used in Southern hybrid- 
izations to genomic DNA to detect gross chromosomal changes such as 
deletions and insertions. The probes can also be used to select cDNA 
clones of APC genes from tumor and normal tissues. In addition, the 
probes can be used to detect APC mRNA in tissues to determine if 
expression is diminished as a result of alteration of wild-type APC 
genes. Provided with the APC coding sequence shown in Figure 7 (SEQ 
ID NO: 1), design of particular probes is well within the skill of the 
ordinary artisan. 

According to the present invention a method is also provided of 
supplying wild-type APC function to a cell which carries mutant APC 
alleles. Supplying such function should suppress neoplastic growth of 
the recipient cells. The wild-type APC gene or a part of the gene may 
be introduced into the cell in a vector such that the gene remains 
extrachromosomal. In such a situation the gene will be expressed by 
the cell from the extrachromosomal location. If a gene portion is 
introduced and expressed in a cell carrying a mutant APC allele, the 
gene portion should encode a part of the APC protein which is required 
for non-neoplastic growth of the cell. More preferred is the situation 
where the wild-type APC gene or a part of it is introduced into the 
mutant cell in such a way that it recombines with the endogenous
mutant APC gene present in the cell. Such recombination requires a 
double recombination event which results in the correction of the APC 
gene mutation. Vectors for introduction of genes both for recombina- 
tion and for extrachromosomal maintenance are known in the art and 
any suitable vector may be used. Methods for introducing DNA into 
cells such as electroporation, calcium phosphate co-precipitation and 
viral transduction are known in the art and the choice of method is 
within the competence of the routineer. Cells transformed with the
wild-type APC gene can be used as model systems to study cancer 
remission and drug treatments which promote such remission. 

Similarly, cells and animals which carry a mutant APC allele can 
be used as model systems to study and test for substances which have 
potential as therapeutic agents. The cells are typically cultured 
epithelial cells. These may be isolated from individuals with APC 
mutations, either somatic or germline. Alternatively, the cell line can 
be engineered to carry the mutation in the APC allele. After a test 
substance is applied to the cells, the neoplastically transformed pheno- 
type of the cell will be determined. Any trait of neoplastically trans- 
formed cells can be assessed, including anchorage-independent growth, 
tumorigenicity in nude mice, invasiveness of cells, and growth factor
dependence. Assays for each of these traits are known in the art. 

Animals for testing therapeutic agents can be selected after 
mutagenesis of whole animals or after treatment of germline cells or 
zygotes. Such treatments include insertion of mutant APC alleles, usu- 
ally from a second animal species, as well as insertion of disrupted 
homologous genes. Alternatively, the endogenous APC gene(s) of the
animals may be disrupted by insertion or deletion mutation. After test 
substances have been administered to the animals, the growth of 
tumors must be assessed. If the test substance prevents or suppresses 
the growth of tumors, then the test substance is a candidate therapeu- 
tic agent for the treatment of FAP and/or sporadic cancers. 

Polypeptides which have APC activity can be supplied to cells 
which carry mutant or missing APC alleles. The sequence of the APC 
protein is disclosed in Figure 3 or 7 (SEQ ID NO: 7 or 1). These two 
sequences differ slightly and appear to indicate the existence of two
different forms of the APC protein. Protein can be produced by 
expression of the cDNA sequence in bacteria, for example, using known 
expression vectors. Alternatively, APC can be extracted from
APC-producing mammalian cells such as brain cells. In addition, the
techniques of synthetic chemistry can be employed to synthesize APC
protein. Any of such techniques can provide the preparation of the
present invention which comprises the APC protein. The preparation
is substantially free of other human proteins. This is most readily
accomplished by synthesis in a microorganism or in vitro.

Active APC molecules can be introduced into cells by 
microinjection or by use of liposomes, for example. Alternatively,
some such active molecules may be taken up by cells, actively or by 
diffusion. Extracellular application of APC gene product may be suffi- 
cient to affect tumor growth. Supply of molecules with APC activity 
should lead to a partial reversal of the neoplastic state. Other mole- 
cules with APC activity may also be used to effect such a reversal, for 
example peptides, drugs, or organic compounds. 

The present invention also provides a preparation of antibodies 
immunoreactive with a human APC protein. The antibodies may be 
polyclonal or monoclonal and may be raised against native APC pro- 
tein, APC fusion proteins, or mutant APC proteins. The antibodies 
should be immunoreactive with APC epitopes, preferably epitopes not 
present on other human proteins. In a preferred embodiment of the 
invention the antibodies will immunoprecipitate APC proteins from 
solution as well as react with APC protein on Western or immunoblots
of polyacrylamide gels. In another preferred embodiment, the antibod- 
ies will detect APC proteins in paraffin or frozen tissue sections, using 
immunocytochemical techniques. Techniques for raising and purifying 
antibodies are well known in the art and any such techniques may be 
chosen to achieve the preparation of the invention. 

Predisposition to cancers as in FAP and GS can be ascertained 
by testing any tissue of a human for mutations of the APC gene. For 
example, a person who has inherited a germline APC mutation would be
prone to develop cancers. This can be determined by testing DNA from 
any tissue of the person's body. Most simply, blood can be drawn and 
DNA extracted from the cells of the blood. In addition, prenatal diagnosis
can be accomplished by testing fetal cells, placental cells, or
amniotic fluid for mutations of the APC gene. Alteration of a wild-type
APC allele, whether, for example, by point mutation or by deletion,
can be detected by any of the means discussed above.

Molecules of cDNA according to the present invention are
intron-free, APC gene coding molecules. They can be made by reverse
transcriptase using the APC mRNA as a template. These molecules
can be propagated in vectors and cell lines as is known in the art. Such 
molecules have the sequence shown in SEQ ID NO: 7. The cDNA can 
also be made using the techniques of synthetic chemistry given the 
sequence disclosed herein. 

A short region of homology has been identified between APC and 
the human m3 muscarinic acetylcholine receptor (mAChR). This 
homology was largely confined to 29 residues in which 6 out of 7 amino 
acids (EL(GorA)GLQA) were identical (See Figure 4). Initially, it was 
not known whether this homology was significant, because many other 
proteins had higher levels of global homology (though few had six out of 
seven contiguous amino acids in common). However, a study on the
sequence elements controlling G protein activation by mAChR subtypes
(Lechleiter et al., EMBO J., p. 4381 (1990)) has shown that a 21 amino
acid region from the m3 mAChR completely mediated G protein specificity
when substituted for the 21 amino acids of m2 mAChR at the
analogous protein position. These 21 residues overlap the 19 amino acid 
homology between APC and m3 mAChR. 
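
As an illustration only, identity within such a short motif can be tallied position by position. The Python sketch below uses the two variants implied by the EL(G or A)GLQA notation; which variant belongs to APC and which to the m3 mAChR is not assumed here, and the function name is hypothetical.

```python
def count_identities(seq_a, seq_b):
    """Return (identical positions, aligned length) for two equal-length peptides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return identical, len(seq_a)


# The two variants of the motif written as EL(G or A)GLQA in the text.
variant_g = "ELGGLQA"
variant_a = "ELAGLQA"

identical, total = count_identities(variant_g, variant_a)
print(f"{identical} of {total} residues identical")
```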

This connection between APC and the G protein activating 
region of mAChR is intriguing in light of previous investigations relat- 
ing G proteins to cancer. For example, the RAS oncogenes, which are 
often mutated in colorectal cancers (Vogelstein et al., N. Engl. J.
Med., Vol. 319, p. 525 (1988); Bos et al., Nature, Vol. 327, p. 293 (1987)),
are members of the G protein family (Bourne et al., Nature, Vol. 348,
p. 125 (1990)) as is an in vitro transformation suppressor (Noda et al.,
Proc. Natl. Acad. Sci. USA, Vol. 86, p. 162 (1989)) and genes mutated in
hormone producing tumors (Landis et al., Nature, Vol. 340, p. 692
(1989); Lyons et al., Science, Vol. 249, p. 655 (1990)). Additionally, the
gene responsible for neurofibromatosis (presumably a tumor suppressor
gene) has been shown to activate the GTPase activity of RAS (Xu et al.,
Cell, Vol. 63, p. 835 (1990); Martin et al., Cell, Vol. 63, p. 843 (1990);
Ballester et al., Cell, Vol. 63, p. 851 (1990)). Another interesting link
between G proteins and colon cancer involves the drug sulindac. This 
agent has been shown to inhibit the growth of benign colon tumors in 
patients with FAP, presumably by virtue of its activity as a 
cyclooxygenase inhibitor (Waddell et al., J. Surg. Oncology 24(1), 83
(1983); Waddell et al., Am. J. Surg., 157(1), 175 (1989); Charneau et al.,
Gastroenterologie Clinique et Biologique 14(2), 153 (1990)).
Cyclooxygenase is required to convert arachidonic acid to 
prostaglandins and other biologically active molecules. G proteins are 
known to regulate phospholipase A2 activity, which generates 
arachidonic acid from phospholipids (Role et al., Proc. Natl. Acad. Sci.
USA, Vol. 84, p. 3623 (1987); Kurachi et al., Nature, Vol. 337, p. 555
(1989)). Therefore we propose that wild-type APC protein functions by 
interacting with a G protein and is involved in phospholipid 
metabolism. 

The following are provided for exemplification purposes only and 
are not intended to limit the scope of the invention which has been 
described in broad terms above. 
Example 1

This example demonstrates the isolation of a 5.5 Mb region of 
human DNA linked to the FAP locus. Six genes are identified in this 
region, all of which are expressed in normal colon cells and in 
colorectal, lung, and bladder tumors.

The cosmid markers YN5.64 and YN5.48 have previously been
shown to delimit an 8 cM region containing the locus for FAP
(Nakamura et al., Am. J. Hum. Genet., Vol. 43, p. 638 (1988)). Further
linkage and pulsed-field gel electrophoresis (PFGE) analysis with additional
markers has shown that the FAP locus is contained within a 4 cM
region bordered by cosmids EF5.44 and L5.99. In order to isolate clones
representing a significant portion of this locus, a yeast artificial chromosome
(YAC) library was screened with various 5q21 markers.
Twenty-one YAC clones, distributed within six contigs and including 
5.5 Mb from the region between YN5.64 and YN5.48, were obtained 
(Figure 1A). 

Three contigs encompassing approximately 4Mb were contained 
within the central portion of this region. The YACs constituting these 
contigs, together with the markers used for their isolation and orienta- 
tions, are shown in Figure 1. These YAC contigs were obtained in the 
following way. To initiate each contig, the sequence of a genomic 
marker cloned from chromosome 5q21 was determined and used to
design primers for PCR. PCR was then carried out on pools of YAC
clones distributed in microtiter trays as previously described (Anand
et al., Nucleic Acids Research, Vol. 18, p. 1951 (1990)). Individual YAC
clones from the positive pools were identified by further PCR or 
hybridization based assays, and the YAC sizes were determined by 
PFGE. 

To extend the areas covered by the original YAC clones, "chro- 
mosomal walking" was performed. For this purpose, YAC termini were 
isolated by a PCR based method and sequenced (Riley et al., Nucleic
Acids Research, Vol. 18, p. 2887 (1990)). PCR primers based on these
sequences were then used to rescreen the YAC library. For example,
the sequence from an intron of the FER gene (Hao et al., Mol. Cell.
Biol., Vol. 9, p. 1587 (1989)) was used to design PCR primers for isolation
of the 28EC1 and 5EH8 YACs. The termini of the 28EC1 YAC
were sequenced to derive markers RHE28 and LHE28, respectively.
The sequences of these two markers were then used to isolate YAC
clones 15CH12 (from RHE28) and 40CF1 and 29EF1 (from LHE28).
These five YACs formed a contig encompassing 1200 kb (contig 1,
Figure 1B).

Similarly, contig 2 was initiated using cosmid N5.66 sequences, 
and contig 3 was initiated using sequences both from the MCC gene and
from cosmid EF5.44. A walk in the telomeric direction from YAC
14FH1 and a walk in the opposite direction from YAC 39GG3 allowed
connection of the initial contig 3 clones through YAC 37HG4
(Figure 1B).

Multipoint linkage analysis with the various markers used to 
define the contigs, combined with PFGE analysis, showed that contigs 1
and 2 were centromeric to contig 3. These contigs were used as tools
to orient and/or identify genes which might be responsible for FAP.
Six genes were found to lie within this cluster of YACs, as follows: 

Contig 1: FER - The FER gene was discovered through its
homology to the viral oncogene ABL (Hao et al., supra). It has an
intrinsic tyrosine kinase activity, and in situ hybridization with an FER
probe showed that the gene was located at 5q11-23 (Morris et al.,
Cytogenet. Cell Genet., Vol. 53, p. 4 (1990)). Because of the potential
role of this oncogene-related gene in neoplasia, we decided to evaluate 
it further with regards to the FAP locus. A human genomic clone from 
FER was isolated (MF 2.3) and used to define a restriction fragment 
length polymorphism (RFLP), and the RFLP in turn used to map FER by 
linkage analysis using a panel of three generation families. This 
showed that FER was very tightly linked to previously defined 
polymorphic markers for the FAP locus. The genetic mapping of FER 
was complemented by physical mapping using the YAC clones derived 
from FER sequences (Figure 1B). Analysis of YAC contig 1 showed that
FER was within 600 kb of cosmid marker M5.28, which maps to within
1.5 Mb of cosmid L5.99 by PFGE of human genomic DNA. Thus, the
YAC mapping results were consistent with the FER linkage data and 
PFGE analyses. 

Contig 2: TB1 - TB1 was identified through a cross-hybridization
approach. Exons of genes are often evolutionarily conserved while
introns and intergenic regions are much less conserved. Thus, if a
human probe cross-hybridizes strongly to the DNA from non-primate
species, there is a reasonable chance that it contains exon sequences.
Subclones of the cosmids shown in Figure 1 were used to screen Southern
blots containing rodent DNA samples. A subclone of cosmid N5.66
(p 5.66-4) was shown to strongly hybridize to rodent DNA, and this
clone was used to screen cDNA libraries derived from normal adult
colon and fetal liver. The ends of the initial cDNA clones obtained in
this screen were then used to extend the cDNA sequence. Eventually,
11 cDNA clones were isolated, covering 2314 bp. The gene detected by
these clones was named TB1. Sequence analysis of the overlapping
clones revealed an open reading frame (ORF) that extended for 1302 bp
starting from the most 5' sequence data obtained (Figure 2A). If this
entire open reading frame were translated, it would encode 434 amino
acids. The product of this gene was not globally homologous to any
other sequence in the current database but showed two significant local
similarities to a family of ADP, ATP carrier/translocator proteins and
mitochondrial brown fat uncoupling proteins which are widely distributed
from yeast to mammals. These conserved regions of TB1
(underlined in Figure 2A) may define a predictive motif for this
sequence family. In addition, TB1 appeared to contain a signal peptide
(or mitochondrial targeting sequence) as well as at least 7
transmembrane domains.

Contig 3: MCC, TB2, SRP and APC - The MCC gene was also 
discovered through a cross-hybridization approach, as described previously
(Kinzler et al., Science, Vol. 251, p. 1366 (1991)). The MCC gene
was considered a candidate for causing FAP by virtue of its tight 
genetic linkage to FAP susceptibility and its somatic mutation in spo- 
radic colorectal carcinomas. However, mapping experiments suggested 
that the coding region of MCC was approximately 50 kb proximal to 
the centromeric end of a 200 kb deletion found in an FAP patient. 
MCC cDNA probes detected a 10 kb mRNA transcript on Northern blot 
analysis of which 4151 bp, including the entire open reading frame, 
have been cloned. Although the 3' non-translated portion or an alter- 
natively spliced form of MCC might have extended into this deletion, it 
was possible that the deletion did not affect the MCC gene product. 
We therefore used MCC sequences to initiate a YAC contig, and subse- 
quently used the YAC clones to identify genes 50 to 250 kb distal to 
MCC that might be contained within the deletion. 

In a first approach, the insert from YAC 24ED6 (Figure 1B) was
radiolabeled and hybridized to a cDNA library from normal colon. One 
of the cDNA clones (YS39) identified in this manner detected a 3.1 kb 
mRNA transcript when used as a probe for Northern blot hybridization. 
Sequence analysis of the YS39 clone revealed that it encompassed 2283 
nucleotides and contained an ORF that extended for 555 bp from the 
most 5' sequence data obtained. If all of this ORF were translated, it 
would encode 185 amino acids (Figure 2B). The gene detected by YS39 
was named TB2. Searches of nucleotide and protein databases revealed 
that the TB2 gene was not identical to any previously reported 
sequences nor were there any striking similarities. 
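
The relationship between the ORF lengths and the predicted protein sizes quoted above (1302 bp and 434 amino acids; 555 bp and 185 amino acids) is simply one codon of three nucleotides per residue. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and it assumes, as the text does, that the entire stated ORF is translated.

```python
def orf_length_to_residues(orf_bp):
    """Number of codons (and hence residues) in an ORF of orf_bp nucleotides."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3


# ORF lengths taken from the description above.
for orf_bp in (1302, 555):
    print(orf_bp, "bp ->", orf_length_to_residues(orf_bp), "amino acids")
```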

Another clone (YS11) identified through the YAC 24ED6 screen 
appeared to contain portions of two distinct genes. Sequences from 
one end of YS11 were identical to at least 180 bp of the signal recognition
particle protein SRP19 (Lingelbach et al., Nucleic Acids Research,
Vol. 16, p. 9431 (1988)). A second ORF, from the opposite end of clone
YS11, proved to be identical to 78 bp of a novel gene which was independently
identified through a second YAC-based approach. For the
latter, DNA from yeast cells containing YAC 14FH1 (Figure 1B) was
digested with EcoRI and subcloned into a plasmid vector. Plasmids that
contained human DNA fragments were selected by colony hybridization 
using total human DNA as a probe. These clones were then used to 
search for cross-hybridizing sequences as described above for TB1, and
the cross-hybridizing clones were subsequently used to screen cDNA
libraries. One of the cDNA clones discovered in this way (FH38) contained
a long ORF (2496 bp), 78 bp of which were identical to the
above-noted sequences in YS11. The ends of the FH38 cDNA clone
were then used to initiate cDNA walking to extend the sequence. 
Eventually, 85 cDNA clones were isolated from normal colon, brain and 
liver cDNA libraries and found to encompass 8973 nucleotides of con- 
tiguous transcript. The gene corresponding to this transcript was 
named APC. When used as probes for Northern blot analysis, APC
cDNA clones hybridized to a single transcript of approximately 9.5 kb, 
suggesting that the great majority of the gene product was represented 
in the cDNA clones obtained. Sequences from the 5' end of the APC
gene were found in YAC 37HG4 but not in YAC 14FH1. However, the
3' end of the APC gene was found in 14FH1 as well as 37HG4. The
yeast artificial chromosome of the present invention designated
YAC 37HG4 has been deposited with the National Collection of Indus- 
trial and Marine Bacteria (NCIMB), P.O. Box 31, 135 Abbey Road, 
Aberdeen AB9 8DG, Scotland, prior to the filing of this patent applica- 
tion. The NCIMB Accession Number of YAC clone YAC 37HG4 is 
40353. Analogously, the 5' end of the MCC coding region was found in 
YAC clones 19AA9 and 26GC3 but not 24ED6 or 14FH1, while the 3'
end displayed the opposite pattern. Thus, MCC and APC transcription 
units pointed in opposite directions, with the direction of transcription 
going from centromeric to telomeric in the case of MCC, and telomeric 
to centromeric in the case of APC. PFGE analysis of YAC DNA 
digested with various restriction endonucleases showed that TB2 and 
SRP were between MCC and APC, and that the 3' ends of the coding
regions of MCC and APC were separated by approximately 150 kb
(Figure 1B).

Sequence analysis of the APC cDNA clones revealed an open 
reading frame of 8,535 nucleotides. The 5' end of the ORF contained a 
methionine codon (codon 1) that was preceded by an in-frame stop 
codon 9 bp upstream, and the 3' end was followed by several in-frame
stop codons. The protein produced by initiation at codon 1 would con- 
tain 2,842 amino acids (Figure 3). The results of database searching 
with the APC gene product were quite complex due to the presence of 
large segments with locally biased amino acid compositions. In spite of
this, APC could be roughly divided into two domains. The N-terminal
25% of the protein had a high content of leucine residues (12%) and 
showed local sequence similarities to myosins, various intermediate 
filament proteins (e.g., desmin, vimentin, neurofilaments) and 
Drosophila armadillo/human plakoglobin. The latter protein is a component
of adhesive junctions (desmosomes) joining epithelial cells
(Franke et al., Proc. Natl. Acad. Sci. U.S.A., Vol. 86, p. 4027 (1989);
Peifer et al., Cell, Vol. 63, p. 1167 (1990)). The C-terminal 75% of APC
(residues 731-2832) is 17% serine by composition with serine residues
more or less uniformly distributed. This large domain also contains 
local concentrations of charged (mostly acidic) and proline residues. 
There was no indication of potential signal peptides, transmembrane 
regions, or nuclear targeting signals in APC, suggesting a cytoplasmic 
localization. 

To detect short similarities to APC, a database search was performed
using the PAM-40 matrix (Altschul, J. Mol. Biol., Vol. 219, p. 555
(1991)). Potentially interesting matches to several proteins were found.
The most suggestive of these involved the ral2 gene product of yeast,
which is implicated in the regulation of ras activity (Fukui et al., Mol.
Cell. Biol., Vol. 9, p. 5617 (1989)). Little is known about how ral2 might
interact with ras but it is interesting to note the positively-charged 
character of this region in the context of the negatively-charged GAP 
interaction region of ras. A specific electrostatic interaction between 
ras and GAP-related proteins has been proposed. 

Because of the proximity of the MCC and APC genes, and the
fact that both are implicated in colorectal tumorigenesis, we searched 
for similarities between the two predicted proteins. Bourne has previously
noted that MCC has the potential to form alpha helical coiled
coils (Nature, Vol. 351, p. 188 (1991)). Lupas and colleagues have
recently developed a program for predicting coiled coil potential from
primary sequence data (Science, Vol. 252, p. 1162 (1991)) and we have
used their program to analyze both MCC and APC. Analysis of MCC
indicated a discontinuous pattern of coiled-coil domains separated by
putative "hinge" or "spacer" regions similar to those seen in laminin
and other intermediate filament proteins. Analysis of the APC 
sequence revealed two regions in the N-terminal domain which had 
strong coiled coil-forming potential, and these regions corresponded to 
those that showed local similarities with myosin and IF proteins on 
database searching. In addition, one other putative coiled coil region 
was identified in the central region of APC. The potential for both 
APC and MCC to form coiled coils is interesting in that such structures 
often mediate homo- and hetero-oligomerization. 

Finally, it had previously been noted that MCC shared a short 
similarity with the region of the m3 muscarinic acetylcholine receptor 
(mAChR) known to regulate specificity of G-protein coupling. The 
APC gene also contained a local similarity to the region of the m3 
mAChR that overlapped with the MCC similarity (Figure 4B). Although 
the similarities to ral2 (Figure 4A) and m3 mAChR (Figure 4B) were not 
statistically significant, they were intriguing in light of previous obser- 
vations relating G-proteins to neoplasia. 

Each of the six genes described above was expressed in normal 
colon mucosa, as indicated by their representation in colon cDNA 
libraries. To study expression of the genes in neoplastic colorectal 
epithelium, we employed reverse transcription-polymerase chain reaction
(PCR) assays. The sequences of FER, TB1, TB2,
MCC, and APC were each used to design primers for PCR performed
with cDNA templates. Each of these genes was found to be expressed
in normal colon, in each of ten cell lines derived from colorectal cancers,
and in tumor cell lines derived from lung and bladder tumors. The
ten colorectal cancer cell lines included eight from patients with sporadic
CRC and two from patients with FAP.
Example 2 

This example demonstrates a genetic analysis of the role of the 
FER gene in FAP and sporadic colorectal cancers. 

We considered FER as a candidate because of its proximity to 
the FAP locus as judged by physical and genetic criteria (see
Example 1), and its homology to known tyrosine kinases with oncogenic 
potential. Primers were designed to PCR-amplify the complete coding 
sequence of FER from the RNA of two colorectal cancer cell lines 
derived from FAP patients. cDNA was generated from RNA and used 
as a template for PCR. The primers used were 
5'-AGAAGGATCCCTTGTGCAGTGTGGA-3' and
5'-GACAGGATCCTGAAGCTGAGTTTG-3'. The underlined nucleotides
were altered from the true FER sequence to create BamHI sites. The
cell lines used were JW and Difi, both derived from colorectal cancers
of FAP patients (C. Paraskeva, B.G. Buckle, D. Sheer, C.B. Wigley,
Int. J. Cancer 34, 49 (1984); M.E. Gross et al., Cancer Res. 51, 1452
(1991)). The resultant 2554 basepair fragments were cloned and
sequenced in their entirety. The PCR products were cloned in the
BamHI site of Bluescript SK (Stratagene) and pools of at least 50 clones
were sequenced en masse using T7 polymerase, as described in Nigro
et al., Nature 342, 705 (1989).
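
As an illustrative check only, the engineered BamHI recognition sequence (GGATCC) can be located in each of the two FER amplification primers given above. The Python sketch below uses hypothetical names; the labels "primer_1" and "primer_2" are arbitrary and imply nothing about orientation.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Primer sequences as given in the text; the labels are arbitrary.
PRIMERS = {
    "primer_1": "AGAAGGATCCCTTGTGCAGTGTGGA",
    "primer_2": "GACAGGATCCTGAAGCTGAGTTTG",
}


def site_position(sequence, site=BAMHI_SITE):
    """Return the 0-based index of the first occurrence of site, or -1 if absent."""
    return sequence.find(site)


for name, seq in PRIMERS.items():
    print(name, "BamHI site at index", site_position(seq))
```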

Only a single conservative amino acid change (GTG->CTG, cre- 
ating a val to leu substitution at codon 439) was observed. The region 
surrounding this codon was then amplified from the DNA of individuals 
without FAP and this substitution was found to be a common 
polymorphism, not specifically associated with FAP. Based on these 
results, we considered it unlikely (though still possible) that the FER gene
was responsible for FAP. To amplify the regions surrounding codon
439, the following primers were used: 5'-TCAGAAAGTGCTGAAGAG-3'
and 5'-GGAATAATTAGGTCTCCAA-3'. PCR products were digested
with PstI, which yields a 50 bp fragment if codon 439 is leucine, but 26
and 24 bp fragments if it is valine. The primers used for sequencing
were chosen from the FER cDNA sequence in Hao et al., supra.
Example 3

This example demonstrates the genetic analysis of MCC, TB2, 
SRP and APC in FAP and sporadic colorectal tumors. Each of these 
genes is linked and encompassed by contig 3 (see Figure 1). 

Several lines of evidence suggested that this contig was of par- 
ticular interest. First, at least three of the four genes in this contig 
were within the deleted region identified in two FAP patients. (See 
Example 5 infra.) Second, allelic deletions of chromosome 5q21 in spo- 
radic cancers appeared to be centered in this region. (Ashton-Rickardt 
et al., Oncogene, in press; and Miki et al., Japn. J. Cancer Res., in
press.) Some tumors exhibited loss of proximal RFLP markers (up to
and potentially including the 5' end of MCC), but no loss of markers
distal to MCC. Other tumors exhibited loss of markers distal to and
perhaps including the 3' end of MCC, but no loss of sequences proximal
to MCC. This suggested either that different ends of MCC were 
affected by loss in all such cases, or alternatively, that two genes (one 
proximal to and perhaps including MCC, the other distal to MCC) were 
separate targets of deletion. Third, clones from each of the six FAP 
region genes were used as probes on Southern blots containing tumor 
DNA from patients with sporadic CRC. Only two examples of somatic 
changes were observed in over 200 tumors studied: a 
rearrangement/deletion whose centromeric end was located within the 
MCC gene (Kinzler et al., supra) and an 800 bp insertion within the
APC gene between nucleotides 4424 and 5584. Fourth, point mutations
of MCC were observed in two tumors (Kinzler et al., supra), strongly
suggesting that MCC was a target of mutation in at least some sporadic 
colorectal cancers. 

Based on these results, we attempted to search for subtle alterations
of contig 3 genes in patients with FAP. We chose to examine
MCC and APC, rather than TB2 or SRP, because of the somatic muta- 
tions in MCC and APC noted above. To facilitate the identification of 
subtle alterations, the genomic sequences of MCC and APC exons were 
determined (see Table I). These sequences were used to design primers 
for PCR analysis of constitutional DNA from FAP patients. 

We first amplified eight exons and surrounding introns of the
MCC gene in affected individuals from 90 different FAP kindreds. The
PCR products were analyzed by a ribonuclease (RNase) protection assay.
In brief, the PCR products were hybridized to in vitro transcribed RNA 
probes representing the normal genomic sequences. The hybrids were 
digested with RNase A, which can cleave at single base pair mis- 
matches within DNA-RNA hybrids, and the cleavage products were 
visualized following denaturing gel electrophoresis. Two separate 
RNase protection analyses were performed for each exon, one with the 
sense and one with the antisense strand. Under these conditions, 
approximately 40% of all mismatches are detectable. Although some 
amino acid variants of MCC were observed in FAP patients, all such 
variants were found in a small percentage of normal individuals. These 
variants were thus unlikely to be responsible for the inheritance of 
FAP. 

We next examined three exons of the APC gene. The three 
exons examined included those containing nt 822-930, 931-1309, and 
the first 300 nt of the most distal exon (nt 1956-2256). PCR and RNase 
protection analysis were performed as described in Kinzler et al., supra,
using the primers underlined in Table I. The primers for nt 1956-2256
were 5'-GCAAATCCTAAGAGAGAACAA-3' and
5'-GATGGCAAGCTTGAGCCAG-3'.

In 90 kindreds, the RNase protection method was used to screen 
for mutations and in an additional 13 kindreds, the PCR products were 
cloned and sequenced to search for mutations not detectable by RNase 
protection. PCR products were cloned into a Bluescript vector modified
as described in T.A. Holton and M.W. Graham, Nucleic Acids Res.
19, 1156 (1991). A minimum of 100 clones were pooled and sequenced. 
Five variants were detected among the 103 kindreds analyzed. Cloning 
and subsequent DNA sequencing of the PCR product of patient P21 
indicated a C to T transition in codon 413 that resulted in a change 
from arginine to cysteine. This amino acid variant was not observed in 
any of 200 DNA samples from individuals without FAP. Cloning and 
sequencing of the PCR product from patients P24 and P34, who demonstrated
the same abnormal RNase protection pattern, indicated that
both had a C to T transition at codon 301 that resulted in a change
from arginine (CGA) to a stop codon (TGA). This change was not
present in 200 individuals without FAP. As this point mutation resulted
in the predicted loss of the recognition site for the enzyme Taq I,
appropriate PCR products could be digested with Taq I to detect the
mutation. This allowed us to determine that the stop codon 
co-segregated with disease phenotype in members of the family of P24. 
The inheritance of this change in affected members of the pedigree 
provides additional evidence for the importance of the mutation. 
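
The logic of the Taq I assay can be sketched as follows. Taq I recognizes TCGA and cleaves after the first T, so a CGA to TGA change in a codon preceded by a T abolishes the site and the PCR product is no longer cut. In the Python sketch below the flanking sequences are invented for illustration and are not the actual APC sequence around codon 301; the function name is likewise hypothetical.

```python
TAQ_I_SITE = "TCGA"  # Taq I recognition sequence
CUT_OFFSET = 1       # Taq I cleaves after the first T (T^CGA)


def fragment_lengths(seq, site=TAQ_I_SITE, cut_offset=CUT_OFFSET):
    """Return the fragment lengths produced by a complete digest of seq."""
    lengths = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    lengths.append(len(seq) - start)
    return lengths


# Hypothetical flanking sequences, invented for this illustration only.
FLANK_5 = "ACGGTACCAT"
FLANK_3 = "GGTTACAGGC"

wild_type = FLANK_5 + "CGA" + FLANK_3  # arginine codon; Taq I site present
mutant = FLANK_5 + "TGA" + FLANK_3     # stop codon; Taq I site lost

print("wild type fragments:", fragment_lengths(wild_type))
print("mutant fragments:   ", fragment_lengths(mutant))
```

With these invented flanks, the wild-type product is cut into two fragments and the mutant product remains uncut, which is the pattern that allows the mutation to be followed through a pedigree.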

Cloning and sequencing of the PCR product from FAP patient 
P9J indicated a C to G transversion at codon 279, also resulting in a
stop codon (change from TCA to TGA). This mutation was not present
in 200 individuals without FAP. Finally, one additional mutation resulting
in a serine (TCA) to stop codon (TGA) at codon 712 was detected in
a single patient with FAP (patient P60). 
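
For clarity, the terms used for these point mutations can be illustrated with a short Python sketch. A transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, while a transversion exchanges one class for the other. The codon table below lists only the codons named in the text and is not a complete genetic code; the function name is hypothetical.

```python
PURINES = {"A", "G"}

# Only the codons named in the text; this is not a full genetic code.
CODON_TO_RESIDUE = {"CGA": "arginine", "TCA": "serine", "TGA": "stop"}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-base codon change and report the coding consequence."""
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three nucleotides long")
    changed = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(changed) != 1:
        raise ValueError("expected exactly one changed base")
    ref_base, alt_base = changed[0]
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    kind = "transition" if same_class else "transversion"
    return kind, CODON_TO_RESIDUE[ref_codon], CODON_TO_RESIDUE[alt_codon]


print(classify_substitution("CGA", "TGA"))  # the codon 301 change
print(classify_substitution("TCA", "TGA"))  # the codon 279 change
```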

The five germline mutations identified are summarized in 
Table IIA, as well as four others discussed in Example 9. In addition to
these germline mutations, we identified several somatic mutations of 
MCC and APC in sporadic CRCs. Seventeen MCC exons were exam- 
ined in 90 sporadic colorectal cancers by RNase protection analysis. In 
each case where an abnormal RNase protection pattern was observed, 
the corresponding PCR products were cloned and sequenced. This led 
to the identification of six point mutations (two described previously)
(Kinzler et al., supra), each of which was not found in the germline of
these patients (Table IIB). Four of the mutations resulted in amino acid
substitutions and two resulted in the alteration of splice site consensus
elements. Mutations at analogous splice site positions in other genes
have been shown to alter RNA processing in vivo and in vitro.

Three exons of APC were also evaluated in sporadic tumors. 
Sixty tumors were screened by RNase protection, and an additional 98 
tumors were evaluated by sequencing. The exons examined included nt
822-930, 931-1309, and 1406-1545 (Table I). A total of three mutations 
were identified, each of which proved to be somatic. Tumor T27 con- 
tained a somatic mutation of CGA (arginine) to TGA (stop codon) at 
codon 33. Tumor T135 contained a GT to GC change at a splice donor
site. Tumor T34 contained a 5 bp insertion (CAGCC between codons
288 and 289) resulting in a stop at codon 291 due to a frameshift.

We serendipitously discovered one additional somatic mutation in 
a colorectal cancer. During our attempt to define the sequences and 
splice patterns of the MCC and APC gene products in colorectal 
epithelial cells, we cloned cDNA from the colorectal cancer cell line 
SW480. The amino acid sequence of the MCC gene from SW480 was 
identical to that previously found in clones from human brain. The 
sequence of APC in SW480 cells, however, differed significantly, in 
that a transition at codon 1338 resulted in a change from glutamine 
(CAG) to a stop codon (TAG). To determine if this mutation was 
somatic, we recovered DNA from archival paraffin blocks of the origi- 
nal surgical specimen (T201) from which the tumor cell line was 
derived 28 years ago. 

DNA was purified from paraffin sections as described in S.E. 
Goelz, S.R. Hamilton, and B. Vogelstein. Biochem. Biophys. Res. 
Comm. 130, 118 (1985). PCR was performed as described in reference 
24, using the primers 5'-GTTCCAGCAGTGTCACAG-3' and
5'-GGGAGATTTCGCTCCTGA-3'. A PCR product containing codon
1338 was amplified from the archival DNA and used to show that the 
stop codon represented a somatic mutation present in the original pri- 
mary tumor and in cell lines derived from the primary and metastatic 
tumor sites, but not from normal tissue of the patient. 
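
As a rough sanity check on the two primers listed above, their length, GC content, and an approximate melting temperature can be computed. This is a minimal sketch only: the Wallace rule used here is a common rule of thumb for short oligonucleotides, not a method stated in the text, and the labels `primer_1` and `primer_2` are illustrative, since the text does not say which primer is forward or reverse.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Wallace rule of thumb: 2 degrees C per A/T plus 4 degrees C per G/C."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Primer sequences as given in the text; the labels are arbitrary.
PRIMERS = {
    "primer_1": "GTTCCAGCAGTGTCACAG",
    "primer_2": "GGGAGATTTCGCTCCTGA",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Under this estimate the two primers are matched in length and approximate melting temperature, which is generally desirable for a primer pair.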

The ten point mutations in the MCC and APC genes so far
discovered in sporadic CRCs are summarized in Table IIB. Analysis of the
number of mutant and wild-type PCR clones obtained from each of
these tumors showed that in eight of the ten cases, the wild-type 
sequence was present in approximately equal proportions to the 
mutant. This was confirmed by RFLP analysis using flanking markers 
from chromosome 5q which demonstrated that only two of the ten 
tumors (T135 and T201) exhibited an allelic deletion on chromosome 5q. 
These results are consistent with previous observations showing that 
20-40% of sporadic colorectal tumors had allelic deletions of
chromosome 5q. Moreover, these data suggest that mutations of 5q21 genes
are not limited to those colorectal tumors which contain allelic
deletions of this chromosome.

Example 4

This example characterizes small, nested deletions in DNA from 
two unrelated FAP patients. 

DNA from 40 FAP patients was screened with cosmids that had
been mapped into a region near the APC locus to identify small
deletions or rearrangements. Two of these cosmids, L5.71 and L5.79,
hybridized with a 1200 kb NotI fragment in DNAs from most of the FAP
patients screened. 

The DNA of one FAP patient, 3214, showed only a 940 kb NotI 
fragment instead of the expected 1200 kb fragment. DNA was ana- 
lyzed from four other members of the patient's immediate family; the 
940 kb fragment was present in her affected mother (4711), but not in 
the other, unaffected family members. The mother also carried a nor- 
mal 1200 kb NotI fragment that was transmitted to her two unaffected 
offspring. These observations indicated that the mutant polyposis 
allele is on the same chromosome as the 940 kb NotI fragment. A sim- 
ple interpretation is that APC patients 3214 and 4711 each carry a 260 
kb deletion within the APC locus. 
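
The size estimate in this interpretation follows from simple subtraction, sketched below. The helper name `inferred_deletion_kb` is an illustrative assumption, and the calculation assumes that both restriction sites bounding the fragment are retained on the deleted chromosome.

```python
def inferred_deletion_kb(normal_kb: int, altered_kb: int) -> int:
    """Deletion size implied by a shortened restriction fragment.

    Assumes the two sites that bound the fragment are both still present,
    so the only change is loss of DNA between them.
    """
    if altered_kb > normal_kb:
        raise ValueError("fragment grew; a site was probably lost, not retained")
    return normal_kb - altered_kb


# NotI fragments seen with L5.79 in patient 3214 and her affected mother.
deletion_3214 = inferred_deletion_kb(1200, 940)
print(deletion_3214, "kb")
```

A fragment that becomes larger, such as the 1300 kb NruI fragment described next, does not fit this calculation, because it indicates that a restriction site itself lies within the deleted DNA.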

If a deletion were present, then other enzymes might also be 
expected to produce fragments with altered mobilities. Hybridization
of L5.79 to NruI-digested DNAs from both affected members of the
family revealed a novel NruI fragment of 1300 kb, in addition to the
normal 1200 kb NruI fragment. Furthermore, MluI fragments in
patients 3214 and 4711 also showed an increase in size consistent with
the deletion of an MluI site. The two chromosome 5 homologs of
patient 3214 were segregated in somatic cell hybrid lines; HHW1155 
(deletion hybrid) carried the abnormal homolog and HHW1159 (normal 
hybrid) carried the normal homolog. 

Because patient 3214 showed only a 940 kb NotI fragment, she 
had not inherited the 1200 kb fragment present in the unaffected 
father's DNA. This observation suggests that he must be heterozygous 
for, and have transmitted, either a deletion of the L5.79 probe region 
or a variant NotI fragment too large to resolve on the gel system. As
expected, the hybrid cell line HHW1159, which carries the paternal
homolog, revealed no resolved NotI fragment when probed with L5.79.
However, probing of HHW1159 DNA with L5.79 following digestion with
other enzymes did reveal restriction fragments, demonstrating the 
presence of DNA homologous to the probe. The father is, therefore, 
interpreted as heterozygous for a polymorphism at the NotI site, with 
one chromosome 5 having a 1200 kb NotI fragment and the other hav- 
ing a fragment too large to resolve consistently on the gel. The latter 
was transmitted to patient 3214. 

When double digests were used to order restriction sites within 
the 1200 kb NotI fragment, L5.71 and L5.79 were both found to lie on a 
550 kb NotI-NruI fragment and, therefore, on the same side of an NruI
site in the 1200 kb NotI fragment. To obtain genomic representation of 
sequences present over the entire 1200 kb NotI fragment, we con- 
structed a library of small-fragment inserts enriched for sequences 
from this fragment. DNA from the somatic cell hybrid HHW141, which 
contains about 40% of chromosome 5, was digested with NotI and 
electrophoresed under pulsed-field gel (PFG) conditions; EcoRI frag- 
ments from the 1200 kb region of this gel were cloned into a phage 
vector. Probe Map30 was isolated from this library. In normal
individuals probe Map30 hybridizes to the 1200 kb NotI fragment and to a
200 kb NruI fragment. This latter hybridization places Map30 distal, with
respect to the locations of L5.71 and L5.79, to the NruI site of the 550
kb NotI-NruI fragment.

Because Map30 hybridized to the abnormal, 1300 kb NruI fragment
of patient 3214, the locus defined by Map30 lies outside the
hypothesized deletion. Furthermore, in normal chromosomes Map30
identified a 200 kb NruI fragment and L5.79 identified a 1200 kb NruI
fragment; the hypothesized deletion must, therefore, be removing an
NruI site, or sites, lying between Map30 and L5.79, and these two
probes must flank the hypothesized deletion. A restriction map of the 
genomic region, showing placement of these probes, is shown in 
Figure 5. 

A NotI digest of DNA from another FAP patient, 3824, was 
probed with L5.79. In addition to the 1200 kb normal NotI fragment, a
fragment of approximately 1100 kb was observed, consistent with the
presence of a 100 kb deletion in one chromosome 5. In this case,
however, digestion with NruI and MluI did not reveal abnormal bands,
indicating that if a deletion were present, its boundaries must lie
distal to the NruI and MluI sites of the fragments identified by L5.79.
Consistent with this expectation, hybridization of Map30 to DNA from
patient 3824 identified a 760 kb MluI fragment in addition to the
expected 860 kb fragment, supporting the interpretation of a 100 kb 
deletion in this patient. The two chromosome 5 homologs of patient 
3824 were segregated in somatic cell hybrid lines; HHW1291 was found 
to carry only the abnormal homolog and HHW1290 only the normal 
homolog. 

That the 860 kb MluI fragment identified by Map30 is distinct
from the 830 kb MluI fragment identified previously by L5.79 was
demonstrated by hybridization of Map30 and L5.79 to a NotI-MluI double
digest of DNA from the hybrid cell (HHW1159) containing the
nondeleted chromosome 5 homolog of patient 3214. As previously
indicated, this hybrid is interpreted as missing one of the NotI sites that
define the 1200 kb fragment. A 620 kb NotI-MluI fragment was seen
with probe L5.79, and an 860 kb fragment was seen with Map30.
Therefore, the 830 kb MluI fragment recognized by probe L5.79 must
contain a NotI site in HHW1159 DNA; because the 860 kb MluI fragment
remains intact, it does not carry this NotI site and must be distinct
from the 830 kb MluI fragment.

Example 5 

This example demonstrates the isolation of human sequences
which span the region deleted in the two unrelated FAP patients char- 
acterized in Example 4. 

A strong prediction of the hypothesis that patients 3214 and 
3824 carry deletions is that some sequences present on normal chromo- 
some 5 homologs would be missing from the hypothesized deletion 
homologs. Therefore, to develop genomic probes that might confirm 
the deletions, as well as to identify genes from the region, YAC clones 
from a contig seeded by cosmid L5.79 were localized from a library 
containing seven haploid human genome equivalents (Albertsen et al.,
Proc. Natl. Acad. Sci. U.S.A., Vol. 87, pp. 4256-4260 (1990)) with
respect to the hypothesized deletions. Three clones, YACs 57B8,
310D8, and 183H12, were found to overlap the deleted region. 

Importantly, one end of YAC 57B8 (clone ATS7) was found to lie 
within the patient 3214 deletion. Inverse polymerase chain reaction 
(PCR) defined the end sequences of the insert of YAC 57B8. PCR 
primers based on one of these end sequences repeatedly failed to 
amplify DNA from the somatic cell hybrid (HHW1155) carrying the 
deleted homolog of patient 3214, but did amplify a product of the 
expected size from the somatic cell hybrid (HHW1159) carrying the 
normal chromosome 5 homolog. This result supported the
interpretation that the abnormal restriction fragments found in the DNA of
patient 3214 result from a deletion. 

Additional support for the hypothesis of deletion in DNA from 
patient 3214 came from subcloned fragments of YAC 183H12, which 
spans the region in question. Y11, an EcoRI fragment cloned from
YAC 183H12, hybridized to the normal, 1200 kb NotI fragment of 
patient 4711, but failed to hybridize to the abnormal, 940 kb NotI frag- 
ment of 4711 or to DNA from deletion cell line HHW1155. This result 
confirmed the deletion in patient 3214. 

Two additional EcoRI fragments from YAC 183H12, Y10 and 
Y14, were localized within the patient 3214 deletion by their failure to 
hybridize to DNA from HHW1155. Probe Y10 hybridizes to a 150 kb
NruI fragment in normal chromosome 5 homologs. Because the 3214
deletion creates the 1300 kb NruI fragment seen with the probes L5.79
and Map30 that flank the deletion, these NruI sites and the 150 kb NruI
fragment lying between must be deleted in patient 3214. Furthermore,
probe Y10 hybridizes to the same 620 kb NotI-MluI fragment seen with
probe L5.79 in normal DNA, indicating its location as L5.79-proximal to
the deleted MluI site and placing it between the MluI site and the
L5.79-proximal NruI site. The MluI site must, therefore, lie between
the NruI sites that define the 150 kb NruI fragment (see Figure 5).

Probe Y11 also hybridized to the 150 kb NruI fragment in the
normal chromosome 5 homolog, but failed to hybridize to the 620 kb
NotI-MluI fragment, placing it L5.79-distal to the MluI site, but
proximal to the second NruI site. Hybridization to the same (860 kb)
MluI fragment as Map30 confirmed the localization of probe Y11
L5.79-distal to the MluI site.

Probe Y14 was shown to be L5.79-distal to both deleted NruI
sites by virtue of its hybridization to the same 200 kb NruI fragment of
the normal chromosome 5 seen with Map30. Therefore, the order of
these EcoRI fragments derived from YAC 183H12 and deleted in
patient 3214, with respect to L5.79 and Map30, is
L5.79-Y10-Y11-Y14-Map30.

The 100 kb deletion of patient 3824 was confirmed by the failure 
of aberrant restriction fragments in this DNA to hybridize with probe 
Y11, combined with positive hybridizations to probes Y10 and/or Y14.
Y10 and Y14 each hybridized to the 1100 kb NotI fragment of patient
3824 as well as to the normal 1200 kb NotI fragment, but Y11
hybridized to the 1200 kb fragment only. In the MluI digest, probe Y14
hybridized to the 860 kb and 760 kb fragments of patient 3824 DNA, but
probe Y11 hybridized only to the 860 kb fragment. We conclude that
the basis for the alteration in fragment size in DNA from patient 3824
is, indeed, a deletion. Furthermore, because probes Y10 and Y14 are
missing from the deleted 3214 chromosome, but present on the deleted
3824 chromosome, and they have been shown to flank probe Y11, the
deletion in patient 3824 must be nested within the patient 3214 
deletion. 
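
The nesting argument can be restated as a set comparison over the ordered probes. The sketch below is illustrative only: the data structure and function names are assumptions, and the conclusion holds only at the resolution of the probes tested, not at the level of exact breakpoints.

```python
# Probe order along the chromosome, as derived in the text.
PROBE_ORDER = ["L5.79", "Y10", "Y11", "Y14", "Map30"]

# Probes reported as absent from each patient's deleted homolog.
MISSING = {
    "3214": {"Y10", "Y11", "Y14"},
    "3824": {"Y11"},
}


def is_contiguous(probes: set[str]) -> bool:
    """True if the missing probes form one uninterrupted block in PROBE_ORDER."""
    idx = sorted(PROBE_ORDER.index(p) for p in probes)
    return not idx or idx == list(range(idx[0], idx[-1] + 1))


def is_nested(inner: set[str], outer: set[str]) -> bool:
    """True if every probe missing in `inner` is also missing in `outer`,
    and `outer` is missing at least one additional probe."""
    return inner < outer


print("3214 block contiguous:", is_contiguous(MISSING["3214"]))
print("3824 nested in 3214:", is_nested(MISSING["3824"], MISSING["3214"]))
```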

Probes Y10, Y11, Y14 and Map30 each hybridized to YAC 310D8,
indicating that this YAC spanned the patient 3824 deletion and, at a
minimum, most of the 3214 deletion. The YAC characterizations,
therefore, confirmed the presence of deletions in the patients and
provided physical representation of the deleted region.

Example 6

This example demonstrates that the MCC coding sequence maps 
outside of the region deleted in the two FAP patients characterized in 
Example 4. 

An intriguing FAP candidate gene, MCC, recently was
ascertained with cosmid L5.71 and was shown to have undergone mutation in
colon carcinomas (Kinzler et al., supra). It was therefore of interest to
map this gene with respect to the deletions in APC patients.
Hybridization of MCC probes with an overlapping series of YAC clones
extending in either direction from L5.71 showed that the 3' end of MCC
must be oriented toward the region of the two APC deletions. 

Therefore, two 3' cDNA clones from MCC were mapped with
respect to the deletions: clone 1CI (bp 2378-4181) and clone 7 (bp
2890-3560). Clone 1CI contains sequences from the C-terminal end of
the open reading frame, which stops at nucleotide 2708, as well as 3'
untranslated sequence. Clone 7 contains sequence that is entirely 3' to
the open reading frame. Importantly, the entire 3' untranslated 
sequence contained in the cDNA clones consists of a single 2.5 kb exon. 
These two clones were hybridized to DNAs from the YACs spanning the 
FAP region. Clone 7 fails to hybridize to YAC 310D8, although it does
hybridize to YACs 183H12 and 57B8; the same result was obtained with
the cDNA 1CI. Furthermore, these probes did show hybridization to
DNAs from both hybrid cell lines (HHW1159 and HHW1155) and the
lymphoblastoid cell line from patient 3214, confirming their locations
outside the deleted region. Additional mapping experiments suggested
that the 3' end of the MCC cDNA clone contig is likely to be located
more than 45 kb from the deletion of patient 3214 and, therefore, more 
than 100 kb from the deletion of patient 3824. 

Example 7

This example identifies three genes within the deleted region of 
chromosome 5 in the two unrelated FAP patients characterized in 
Example 4. 

Genomic clones were used to screen cDNA libraries in three 
separate experiments. One screening was done with a phage clone 
derived from YAC 310D8 known to span the 260 kb deletion of patient 
3214. A large-insert phage library was constructed from this YAC; 
screening with Y11 identified X205, which mapped within both
deletions. When clone X205 was used to probe a random- plus
oligo(dT)-primed fetal brain cDNA library (approximately 300,000 phage),
six cDNA clones were isolated and each of them mapped entirely within
both deletions. Sequence analysis of these six clones formed a single
cDNA contig, but did not reveal an extended open reading frame. One
of the six cDNAs was used to isolate more cDNA clones, some of which
crossed the L5.71-proximal breakpoint of the 3824 deletion, as
indicated by hybridization to both chromosomes of this patient. These
clones also contained an open reading frame, indicating a transcrip- 
tional orientation proximal to distal with respect to L5.71. This gene 
was named DP1 (deleted in polyposis 1). This gene is identical to 1B2
described above. 

cDNA walks yielded a cDNA contig of 3.0-3.5 kb, and included 
two clones containing terminal poly(A) sequences. This size
corresponds to the 3.5 kb band seen by Northern analysis. Sequencing of the
first 3163 bp of the cDNA contig revealed an open reading frame 
extending from the first base to nucleotide 631, followed by a 2.5 kb 3' 
untranslated region. The sequence surrounding the methionine codon 
at base 77 conforms to the Kozak consensus of an initiation methionine 
(Kozak, 1984). Failed attempts to walk farther, coupled with the simi- 
larity of the lengths of isolated cDNA and mRNA, suggested that the 
NH2-terminus of the DP1 protein had been reached. Hybridization to a
combination of genomic and YAC DNAs cut with various enzymes
indicated the genomic coverage of DP1 to be approximately 30 kb.

Two additional probes for the locus, YS-11 and YS-39, which had 
been ascertained by screening of a cDNA library with an independent
YAC probe identified with MCC sequences adjacent to L5.71, were 
mapped into the deletion region. YS-39 was shown to be a cDNA
identical in sequence to DP1. Partial characterization of YS-11 had shown
that 200 bp of DNA sequence at one end was identical to sequence cod- 
ing for the 19 kd protein of the ribosomal signal recognition particle, 
SRP19 (Lingelbach et al., supra). Hybridization experiments mapped
YS-11 within both deletions. The sequence of this clone, however, was 
found to be complex. Although 454 bp of the 1032 bp sequence of 
YS-11 were identical to the GenBank entry for the SRP19 gene, 
another 578 bp appended 5' to the SRP19 sequence was found to consist 
of previously unreported sequence containing no extended open reading 
frames. This suggested that YS-11 was either a chimeric clone con- 
taining two independent inserts or a clone of an incompletely processed 
or aberrant message. If YS-11 were a conventional chimeric clone, the
independent segments would not be expected to map to the same
physical region. The segments resulting from anomalous processing of a
continuous transcript, however, would map to a single chromosomal 
region. 

Inverse PCR with primers specific to the two ends of YS-11, the
SRP19 end and the unidentified region, verified that both sequences
map within the YAC 310D8; therefore, YS-11 is most likely a clone of
an immature or anomalous mRNA species. Subsequently, both ends
were shown to lie within the deleted region of patient 3824, and YS-11
was used to screen for additional cDNA clones.

Of the 14 cDNA clones selected from the fetal brain library, one
clone, V5, was of particular interest in that it contained an open
reading frame throughout, although it included only a short identity to
the first 78 5' bases of the YS-11 sequence. Following the 78 bp of
identical sequence, the two cDNA sequences diverged at an AG.
Furthermore, divergence from genomic sequence was also seen after these
78 bp, suggesting the presence of a splice junction, and supporting the
view that YS-11 represents an irregular message.

Starting with V5, successive 5' and 3' walks were performed; the 
resulting cDNA contig consisted of more than 100 clones, which 
defined a new transcript, DP2. Clones walking in the 5' direction 
crossed the 3824 deletion breakpoint farthest from L5.71; since its 3' 
end is closer to this cosmid than its 5' end, the transcriptional
orientation of DP2 is opposite to that of MCC and DP1.

The third screening approach relied on hybridization with a 120 
kb MluI fragment from YAC 57B8. This fragment hybridizes with probe
Y11 and completely spans the 100 kb deletion in patient 3824. The
fragment was purified on two preparative PFGs, labeled, and used to
screen a fetal brain cDNA library. A number of cDNA clones
previously identified in the development of the DP1 and DP2 contigs were
reascertained. However, 19 new cDNA clones mapped into the patient
3824 deletion. Analysis indicated that these 19 formed a new contig,
DP3, containing a large open reading frame.

A clone from the 5' end of this new cDNA contig hybridized to 
the same EcoRI fragment as the 3' end of DP2. Subsequently, the DP2
and DP3 contigs were connected by a single 5' walking step from DP3,
to form the single contig DP2.5. The complete nucleotide sequence of
DP2.5 is shown in Figure 9.

The consensus cDNA sequence of DP2.5 suggests that the entire 
coding sequence of DP2.5 has been obtained and is 8532 bp long. The
most 5' ATG codon occurs two codons from an in-frame stop and
conforms to the Kozak initiation consensus (Kozak, Nucl. Acids Res.,
Vol. 12, pp. 857-872 (1984)). The 3' open reading frame breaks down over
the final 1.8 kb, giving multiple stops in all frames. A poly(A) sequence
was found in one clone approximately 1 kb into the 3' untranslated
region, associated with a polyadenylation signal 33 bp upstream
(position 9530). The open reading frame is almost identical to that
identified as APC above.

An alternatively spliced exon at nucleotide 934 of the DP2.5 
transcript is of potential interest. It was first discovered by noting
that two classes of cDNA had been isolated. The more abundant cDNA 
class contains a 303 bp exon not included in the other. The presence in 
vivo of the two transcripts was verified by an exon connection experi- 
ment. Primers flanking the alternatively spliced exon were used to 
amplify, by PCR, cDNA prepared from various adult tissues. Two PCR 
products that differed in size by approximately 300 bases were ampli- 
fied from all the tissues tested; the larger product was always more 
abundant than the smaller. 

Example 8

This example describes the primers used to identify subtle
mutations in DP1, SRP19, and DP2.5.

To obtain DNA sequence adjacent to the exons of the genes DP1,
DP2.5, and SRP19, sequencing substrate was obtained by inverse PCR
amplification of DNAs from two YACs, 310D8 and 183H12, that span 
the deletions. Ligation at low concentration cyclized the restriction 
enzyme-digested YAC DNAs. Oligonucleotides with sequencing tails, 
designed in inverse orientation at intervals along the cDNAs, primed 
PCR amplification from the cyclized templates. Comparison of these 
DNA sequences with the cDNA sequences placed exon boundaries at
the divergence points. SRP19 and DP1 were each shown to have five
exons. DP2.5 consisted of 15 exons. The sequences of the
oligonucleotides synthesized to provide PCR amplification primers for
the exons of each of these genes are listed in Table III. With the
exception of exons 1, 3, 4, 9, and 15 of DP2.5 (see below), the primer
sequences were located in intron sequences flanking the exons. The 5'
primer of exon 1 is complementary to the cDNA sequence, but extends
just into the 5' Kozak consensus sequence for the initiator methionine,
allowing a survey of the translated sequences. The 5' primer of exon 3
is actually in the 5' coding sequences of this exon, as three separate
intronic primers simply would not amplify. The 5' primer of exon 4 just
overlaps the 5' end of this exon, and we thus fail to survey the 19 most
5' bases of this exon. For exon 9, two overlapping primer sets were 
used, such that each had one end within the exon. For exon 15, the 
large 3' exon of DP2.5, overlapping primer pairs were placed along the 
length of the exon; each pair amplified a product of 250-400 bases. 
Example 9 

This example demonstrates the use of single stranded conforma- 
tion polymorphism (SSCP) analysis as described by Orita et al. Proc. 
Natl. Acad. Sci. U.S.A., Vol. 86, pp. 2766-70 (1989) and Genomics, 
Vol. 5, pp. 874-879 (1989) as applied to DPI, SRP19 and DP2.5. 

SSCP analysis identifies most single- or multiple-base changes in 
DNA fragments up to 400 bases in length. Sequence alterations are 
detected as shifts in electrophoretic mobility of single-stranded DNA 
on nondenaturing acrylamide gels; the two complementary strands of a 
DNA segment usually resolve as two SSCP conformers of distinct 
mobilities. However, if the sample is from an individual heterozygous 
for a base-pair variant within the amplified segment, often three or 
more bands are seen. In some cases, even the sample from a 
homozygous individual will show multiple bands. Base-pair-change 
variants are identified by differences in pattern among the DNAs of 
the sample set. 

Exons of the candidate genes were amplified by PCR from the 
DNAs of 61 unrelated FAP patients and a control set of 12 normal
individuals. The five exons from DP1 revealed no unique conformers in the
FAP patients, although common conformers were observed with exons
2 and 3 in some individuals of both affected and control sets, indicating
the presence of DNA sequence polymorphisms. Likewise, none of the
five exons of SRP19 revealed unique conformers in DNA from FAP 
patients in the test panel. 

Testing of exons 1 through 14 and primer sets A through N of 
exon 15 of the DP2.5 gene, however, revealed variant conformers
specific to FAP patients in exons 7, 8, 10, 11, and 15. These variants were
in the unrelated patients 3746, 3460, 3827, 3712, and 3751, respectively. 
The PCR-SSCP procedure was repeated for each of these exons in the 
five affected individuals and in an expanded set of 48 normal controls. 
The variant bands were reproducible in the FAP patients but were not 
observed in any of the control DNA samples. Additional variant con- 
formers in exons 11 and 15 of the DP2.5 gene were seen; however, each 
of these was found in both the affected and control DNA sets. The five 
sets of conformers unique to the FAP patients were sequenced to 
determine the nucleotide changes responsible for their altered mobili- 
ties. The normal conformers from the host individuals were sequenced 
also. Bands were cut from the dried acrylamide gels, and the DNA was 
eluted. PCR amplification of these DNAs provided template for 
sequencing. 

The sequences of the unique conformers from exons 7, 8, 10, and 
11 of DP2.5 revealed dramatic mutations in the DP2.5 gene. The 
sequence of the new mutation creating the exon 7 conformer in patient 
3746 was shown to contain a deletion of two adjacent nucleotides, at 
positions 730 and 731 in the cDNA sequence (Figure 7). The normal 
sequence at this splice junction is CAGAGGTCA (intronic sequence
underlined), with the intron-exon boundary between the two repetitions 
of AG. The mutant allele in this patient has the sequence CAGGTCA. 
Although this change is at the 5' splice site, comparison with known
consensus sequences of splice junctions would suggest that a functional
splice junction is maintained. If this new splice junction were
functional, the mutation would introduce a frameshift that creates a stop
codon 15 nucleotides downstream. If the new splice junction were not
functional, messenger processing would be significantly altered.

To confirm the 2-base deletion, the PCR product from FAP
patient 3746 and a control DNA were electrophoresed on an 
acrylamide-urea denaturing gel, along with the products of a sequenc- 
ing reaction. The sample from patient 3746 showed two bands differing 
in size by 2 nucleotides, with the larger band identical in mobility to 
the control sample; this result was independent confirmation that 
patient 3746 is heterozygous for a 2 bp deletion. 

The unique conformer found in exon 8 of patient 3460 was found 
to carry a C-T transition, at position 904 in the cDNA sequence of 
DP2.5 (shown in Figure 7), which replaced the normal sequence of CGA 
with TGA. This point mutation, when read in frame, results in a stop 
codon replacing the normal arginine codon. This single-base change 
had occurred within the context of a CG dimer, a potential hot spot for 
mutation (Barker et al., 1984). 

The conformer unique to FAP patient 3827 in exon 10 was found 
to contain a deletion of one nucleotide (1367, 1368, or 1369) when com- 
pared to the normal sequence found in the other bands on the SSCP gel. 
This deletion, occurring within a set of three T's, changed the sequence
from CTTTCA to CTTCA; this 1 base frameshift creates a downstream
stop within 30 bases. The PCR product amplified from this patient's
DNA also was electrophoresed on an acrylamide-urea denaturing gel,
along with the PCR product from a control DNA and products from a 
sequencing reaction. The patient's PCR product showed two bands 
differing by 1 bp in length, with the larger identical in mobility to the 
PCR product from the normal DNA; this result confirmed the presence 
of a 1 bp deletion in patient 3827. 
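
The reason the deleted nucleotide is given as 1367, 1368, or 1369 can be shown with a short sketch: removing any one base from a run of identical bases yields the same product, so sequencing cannot tell which one was lost. The variable names are illustrative.

```python
normal = "CTTTCA"           # normal sequence from the text
run_indices = [1, 2, 3]     # 0-based positions of the three T's within `normal`

# Delete each T in turn and collect the distinct resulting sequences.
products = {normal[:i] + normal[i + 1:] for i in run_indices}
print(products)
```

All three deletions give CTTCA, the mutant sequence reported for patient 3827.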

Sequence analysis of the variant conformer of exon 11 from 
patient 3712 revealed the substitution of a T by a G at position 1500, 
changing the normal tyrosine codon to a stop codon. 

The pair of conformers observed in exon 15 of the DP2.5 gene 
for FAP patient 3751 also was sequenced. These conformers were 
found to carry a nucleotide substitution of C to G at position 5253, the 
third base of a valine codon. No amino acid change resulted from this
substitution, suggesting that this conformer reflects a genetically silent
polymorphism.

The observation of distinct inactivating mutations in the DP2.5
gene in four unrelated patients strongly suggested that DP2.5 is the
gene involved in FAP. These mutations are summarized in Table IIA.
Example 10 

This example demonstrates that the mutations identified in the
DP2.5 (APC) gene segregate with the FAP phenotype.

Patient 3746, described above as carrying an APC allele with a
frameshift mutation, is an affected offspring of two normal parents.
Colonoscopy revealed no polyps in either parent nor among the
patient's three siblings.

DNA samples from both parents, from the patient's wife, and
from their three children were examined. SSCP analysis of DNA from
both of the patient's parents displayed the normal pattern of
conformers for exon 7, as did DNA from the patient's wife and one of his
offspring. The two other children, however, displayed the same new
conformers as their affected father. Testing of the patient and his parents
with highly polymorphic VNTR (variable number of tandem repeat) 
markers showed a 99.98% likelihood that they are his biological 
parents. 

These observations confirmed that this novel conformer, known 
to reflect a 2 bp deletion mutation in the DP2.5 gene, appeared
spontaneously with FAP in this pedigree and was transmitted to two of the
children of the affected individual. 

Example 11 

This example demonstrates polymorphisms in the APC gene 
which appear to be unrelated to disease (FAP). 

Sequencing of variant conformers found among controls as well 
as individuals with APC has revealed the following polymorphisms in 
the APC gene: first, in exon 11, at position 1458, a substitution of T to 
C creating an RsaI restriction site but no amino acid change; and second, 
in exon 15, at positions 5037 and 5271, substitutions of A to G and 
G to T, respectively, neither resulting in amino acid substitutions. 
These nucleotide polymorphisms in the APC gene sequence may be 
useful for diagnostic purposes. 

Example 12 

This example shows the structure of the APC gene. 

The structure of the APC gene is schematically shown in 
Figure 8, with flanking intron sequences indicated. 

The continuity of the very large (6.5 kb), most 3' exon in DP2.5 
was shown in two ways. First, inverse PCR with primers spanning the 
entire length of this exon revealed no divergence of the cDNA 
sequence from the genomic sequence. Second, PCR amplification with 
converging primers placed at intervals along the exon generated prod- 
ucts of the same size whether amplified from the originally isolated 
cDNA, cDNA from various tissues, or genomic template. Two forms of 
exon 9 were found in DP2.5: one is the complete exon; and the other, 
labeled exon 9A, is the result of a splice into the interior of the exon 
that deletes bases 934 to 1236 in the mRNA and removes 101 amino 
acids from the predicted protein (see Figure 7). 
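
The arithmetic behind the exon 9A statement can be verified directly: an inclusive deletion of bases 934 to 1236 should span a whole number of codons if it removes 101 amino acids without shifting the reading frame. The helper name `deleted_span` below is hypothetical and serves only as an illustration.

```python
def deleted_span(first_base, last_base):
    """Return (bases, codons, in_frame) for an inclusive deletion."""
    n_bases = last_base - first_base + 1
    return n_bases, n_bases // 3, n_bases % 3 == 0


# Exon 9A: bases 934 to 1236 of the mRNA are spliced out.
n_bases, n_codons, in_frame = deleted_span(934, 1236)
print(n_bases, n_codons, in_frame)
```

The deletion covers 303 bases, that is 101 codons, and is a multiple of three, so the downstream reading frame is preserved.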
Example 13 

This example demonstrates the mapping of the FAP deletions 
with respect to the APC exons. 

Somatic cell hybrids carrying the segregated chromosomes 5 
from the 100 kb (HHW1291) and 260 kb (HHW1155) deletion patients 
were used to determine the distribution of the APC gene's exons across 
the deletions. DNAs from these cell lines were used as template, along 
with genomic DNA from a normal control, for PCR-based amplification 
of the APC exons. 

PCR analysis of the hybrids from the 260 kb deletion of patient 
3214 showed that all but one (exon 1) of the APC exons are removed by 
this deletion. PCR analysis of the somatic cell hybrid HHW1291, carry- 
ing the chromosome 5 homolog with the 100 kb deletion from patient 
3824, revealed that exons 1 through 9 are present but exons 10 through 
15 are missing. This result placed the deletion breakpoint either 
between exons 9 and 10 or within exon 10. 
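
The inference drawn from the exon-by-exon PCR results can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the example: exons are taken to be numbered 1 through 15, each deletion is assumed to remove one contiguous block at the 3' end of the gene, and the function name `breakpoint_interval` is hypothetical.

```python
def breakpoint_interval(present):
    """present maps exon number -> True if a PCR product was obtained
    from the hybrid carrying the deleted chromosome.

    Returns (last retained exon, first missing exon); the breakpoint lies
    between these two exons or within the first missing exon."""
    exons = sorted(present)
    retained = [e for e in exons if present[e]]
    missing = [e for e in exons if not present[e]]
    if not retained or not missing:
        raise ValueError("need at least one retained and one missing exon")
    if max(retained) > min(missing):
        raise ValueError("retained and missing exons are interleaved")
    return max(retained), min(missing)


# 100 kb deletion (HHW1291): exons 1 through 9 present, 10 through 15 missing.
hhw1291 = {exon: exon <= 9 for exon in range(1, 16)}
# 260 kb deletion: only exon 1 is retained.
deletion_260kb = {exon: exon == 1 for exon in range(1, 16)}

print(breakpoint_interval(hhw1291))
print(breakpoint_interval(deletion_260kb))
```

For the 100 kb deletion the function returns exons 9 and 10, matching the conclusion that the breakpoint lies between exons 9 and 10 or within exon 10.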
Example 14 

This example demonstrates the expression of alternately spliced 
APC messenger in normal tissues and in cancer cell lines. 

Tissues that express the APC gene were identified by PCR 
amplification of cDNA made from mRNA, using primers located within 
adjacent APC exons. In addition, PCR primers that flank the alterna- 
tively spliced exon 9 were chosen so that the expression pattern of 
both splice forms could be assessed. All tissue types tested (brain, lung, 
aorta, spleen, heart, kidney, liver, stomach, placenta, and colonic 
mucosa) and cultured cell lines (lymphoblasts, HL60, and 
choriocarcinoma) expressed both splice forms of the APC gene. We 
note, however, that expression by lymphocytes normally residing in 
some tissues, including colon, prevents unequivocal assessment of 
expression. The large mRNA, containing the complete exon 9 rather 
than only exon 9A, appears to be the more abundant message. 

Northern analysis of poly(A)-selected RNA from lymphoblasts 
revealed a single band of approximately 10 kb, consistent with the size 
of the sequenced cDNA. 
Example 15 

This example discusses structural features of the APC protein 
predicted from the sequence. 

The cDNA consensus sequence of APC predicts that the longer, 
more abundant form of the message codes for a 2842 or 2844 amino 
acid peptide with a mass of 311.8 kd. This predicted APC peptide was 
compared with the current data bases of protein and DNA sequences 
using both Intelligenetics and GCG software packages. No genes with a 
high degree of amino acid sequence similarity were found. Although 
many short (approximately 20 amino acid) regions of sequence similarity 
were uncovered, none was sufficiently strong to reveal which, if 
any, might represent functional homology. Interestingly, multiple similarities 
to myosins and keratins did appear. The APC gene also was 
scanned for sequence motifs of known function; although multiple 
glycosylation, phosphorylation, and myristoylation sites were seen, 
their significance is uncertain. 
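
The coding interval given in the sequence listing for SEQ ID NO:1 (CDS 34..8562) fixes the number of codons in the long form of the message. The short sketch below counts them and derives an average mass per residue from the 311.8 kd figure quoted above. It assumes the stated interval excludes the stop codon; the function name `cds_codons` is hypothetical.

```python
def cds_codons(start, end):
    """Number of codons in an inclusive coding interval (1-based coordinates)."""
    n_bases = end - start + 1
    if n_bases % 3:
        raise ValueError("coding interval is not a multiple of three")
    return n_bases // 3


codons = cds_codons(34, 8562)
mean_residue_mass = 311.8e3 / codons  # daltons per residue, from the quoted mass

print(codons, round(mean_residue_mass, 1))
```

The interval holds 2843 codons, which agrees with the length stated for SEQ ID NO:2 in the sequence listing, and the implied average of roughly 110 daltons per residue is typical for a protein.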

Analysis of the APC peptide sequence did identify features 
important in considering potential protein structure. Hydropathy plots 
(Kyte and Doolittle, J. Mol. Biol., Vol. 157, pp. 105-132 (1982)) indicate 
that the APC protein is notably hydrophilic. No hydrophobic domains 



suggesting a signal peptide or a membrane-spanning domain were 
found. Analysis of the first 1000 residues indicates that α-helical rods 
may form (Cohen and Parry, Trends Biochem. Sci., Vol. 11, pp. 245-248 
(1986)); there is a scarcity of proline residues, and there are a number of 
regions containing heptad repeats (apolar-X-X-apolar-X-X-X). Interestingly, 
in exon 9A, the deleted form of exon 9, two heptad repeat 
regions are reconnected in the proper heptad repeat frame, deleting 
the intervening peptide region. After the first 1000 residues, the high 
proline content of the remainder of the peptide suggests a compact 
rather than a rod-like structure. 
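
A scan for the heptad pattern named above (apolar-X-X-apolar-X-X-X) might look like the sketch below. It is not the analysis of Cohen and Parry and not the procedure used here: the set of residues counted as apolar, the minimum run length, and the function name `heptad_runs` are all assumptions made for illustration, and the test peptide is synthetic.

```python
APOLAR = set("AVLIMFWC")  # assumption: one possible "apolar" alphabet


def heptad_runs(seq, min_repeats=3):
    """Find runs of consecutive heptads (apolar-X-X-apolar-X-X-X).

    Returns sorted (start_index, n_heptads) tuples, 0-based, for every
    in-frame run of at least `min_repeats` heptads."""
    runs = []
    for frame in range(7):
        i = frame
        while i + 7 <= len(seq):
            n = 0
            # Extend the run while the 1st and 4th residues of the next
            # complete heptad are both apolar.
            while (i + 7 * (n + 1) <= len(seq)
                   and seq[i + 7 * n] in APOLAR
                   and seq[i + 7 * n + 3] in APOLAR):
                n += 1
            if n >= min_repeats:
                runs.append((i, n))
            i += 7 * max(n, 1)
    return sorted(runs)


# Synthetic peptide: three perfect heptads followed by a proline-rich tail.
toy = "LSSLSSS" * 3 + "PPSPPSP"
print(heptad_runs(toy))
```

Because the scan keeps track of the frame, it can also show whether two heptad regions joined by a deletion, as described for exon 9A, remain in the same heptad register.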

The most prominent feature of the second 1000 residues is a 20 
amino acid repeat that is iterated seven times with semiregular spacing 
(Table 4). The intervening sequences between the seven repeat regions 
contained 114, 116, 151, 205, 107, and 58 amino acids, respectively. 
Finally, residues 2200-2400 contain a 200 amino acid basic domain. 
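
The spacing figures for the 20 amino acid repeat can be tallied to confirm that they are self-consistent: seven repeats imply six intervening sequences, and the whole array should fit within a 1000-residue stretch. The variable names below are illustrative only.

```python
REPEAT_LEN = 20                       # length of each repeat, in amino acids
gaps = [114, 116, 151, 205, 107, 58]  # intervening sequences, in order

n_repeats = len(gaps) + 1             # six gaps separate seven repeats
span = n_repeats * REPEAT_LEN + sum(gaps)

print(n_repeats, span)
```

The seven repeats and six intervening sequences together span 891 residues.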

(1) GENERAL INFORMATION: 

(i) APPLICANT: ALBERTSEN, HANS 
ANAND, RAKESH 
CARLSON, MARY 
GRODEN, JOANNA 
HEDGE, PHILIP J. 
JOSLYN, GEOFF 
KINZLER, KENNETH 
MARKHAM, ALEXANDER F. 
NAKAMURA, YUSUKE 
THLIVERIS, ANDREW 

(ii) TITLE OF INVENTION: INHERITED AND SOMATIC MUTATIONS OF APC 
GENE IN COLORECTAL CANCER IN HUMANS 

(iii) NUMBER OF SEQUENCES: 94 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Banner, Birch, McKie & Beckett 

(B) STREET: 1001 G Street, NW 

(C) CITY: Washington 

(D) STATE: D.C. 
(E) COUNTRY: USA 
(F) ZIP: 20001-4598 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/741,940 

(B) FILING DATE: 08-AUG-1991 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Regan, Sarah A. 

(B) REGISTRATION NUMBER: 32,141 

(C) REFERENCE/DOCKET NUMBER: 1107.035574 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-508-9100 

(B) TELEFAX: 202-508-9299 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9606 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: DP2.5(APC) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 34..8562 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
CGACTCGGAA ATGAGCTCCA AGCGTAGCCA ACG ATO GCT CCA OCT TCA TAT CAT 

1 5 

CAO TTC TTA AAC CAA CTT CAC CCA CTC AAO ATO CAC AAC TCA AAT CTT 
2 X £• !K Val Clu Al. L«u Ly. M.t Cl« A.n Ser A.n L.u 

10 15 •* 

. _ evP «»t tcC AAT CAT CTT ACA AAA CTC CAA ACT 

SS Si StS 21 S£ 2£ £ £ ^ *g «*• ^ «- ** 

25 30 35 

r*r cex TCT AAT ATC AAC CAA CTA CTT AAA CAA CTA CAA CCA ACT ATT 
K K IS £ S t£ Glu val L.u Ly. Cln Leu Cln Cly S.r II. 
40 45 so 

CAA CAT CAA OCT ATO OCT TCT TCT CCA CAC ATT CAT TTA TTA CAO COT 
Sp Sta aS Met Ala Ser Ser Cly Cln lie A.p Leu Leu Clu Arg 

60 65 

... « r e— mxc tta GAT AOC ACT AAT TTC CCT GGA GTA AAA CTC 
S 2 SS SJ S.r Ser A.n Ph. Fro Cly Val Ly. Leu 



75 



ecc TCA AAA ATC TCC CTC CCT TCT TAT CCA ACC CCC CAA CCA TCT CTA 
Jrt 12 J£ S £ Leu Arg S.r Tyr Cly S.r Ar, Clu Cly Ser Val 



90 



. rr mm tcT GGA GAG TGC ACT CCT CTT CCT ATG GGT TCA TTT CCA 
s2 Ser s2 £J SS ™ Sar Pro Val Pro K.t Oly S.r Phe Pro 

105 110 11 

ecc TTT CTA AAT GGA AGC AGA GAA ACT ACT GGA TAT TTA CAA 
S j££ S St En Xy S« Arg Glu S;r Thr Gly Tyr L.u Glu 
120 " 5 130 

r~r* r*r AAA GAG AGG TCA TTG CTT CTT GCT GAT CTT GAC AAA GAA 
Clu Leu Clu Ly. Clu Arg Ser Leu Leu Leu Ala A.p Leu A.p Ly. Clu 
140 " 5 

rAA AAA CAC TGG TAT TAC GCT CAA CTT CAG AAT CTC ACT AAA 
SJ iyi 2? S S Tyr Ala Cln Leu Cln A.n Lju Thr Ly. 

155 160 *•» 

*ea ATA CAT ACT CTT CCT TTA ACT CAA AAT TTT TCC TTA CAA ACA CAT 
S£ z5 £* £ £ lm Thr Clu A.n Ph. 8.r Leu Cln Thr A.p 



170 



TTG ACC ACA AGC CAA TTC CAA TAT CAA CCA ACC CAA ATC ACA CTT CCC 
Leu Thr Arg Arg Cln Leu Olu Tyr Glu Al. Arg Gin II. Arg Val Al. 

185 WO w 
gAA CAA CTA GGT ACC TGC CAG GAT ATG CAA AAA CCA CCA CAG 678 
Ket Glu Glu Gin 2S cly Thr Cy. Gin Asp Met Glu I*. Ar, Ala Gin 

200 205 

f«r* tra ATA CCC AGA ATT GAG CAA ATC GAA AAG GAC ATA CTT CGT ATA 726 
Si K £ Si S Sn Gin He Glu Ly. A.p lie Leu Arg He 
220 225 

CGA CAG CTT TTA CAG TCC CAA GCA ACA GAA GCA GAG AGO TCA TCT CAG 774 
So §ia Eeu Leu Gin Ser Gin Ala Thr Glu Ala Glu Arg Ser Ser Gin 
9 235 240 245 

»AC AAG CAT GAA ACC GGC TCA CAT GAT GCT GAG CGG CAG AAT GAA GGT 822 
iyl His Si tS gT? Ser HI. Aap Ala Glu Ar, Gin A.n Glu Gly 

CAA GGA GTG GGA GAA ATC AAC ATG GCA ACT TCT GGT AAT GOT CAG GOT 
S5 Gly val Gly Glu II. A.n Mat Ala Thr Ser Gly A«n Gly Gin Gly 
265 270 2 

TCA ACT ACA CGA ATG GAC CAT GAA ACA GCC ACT GTT TTG ACT TCT AGT 918 
J2 tS 5S Sg Het An> Hi. Glu Thr Aim Ser Val Leu Ser Ser Ser 

280 285 

• - r CAC TCT GCA CCT CGA ACG CTG ACA AGT CAT CTG GGA ACC AAG 966 

Ser Thr Hie Ser Ala Pro Arg Arg Leu Thr Ser Bis Leu Gly Thr Ly. 

300 305 * 

CTG GAA ATG GTG TAT TCA TTG TTG TCA ATG CTT GGT ACT CAT GAT AAG 
53 Si MM Val Tyr Ser Leu Leu Ser Met Leu Gly Thr Hie A *P !*■ 

320 325 



315 



GAT GAT ATC TCO CGA ACT TTG CTA GCT ATG TCT AGC TCC CAA GAC AGC 
2? 2? Me? ier Arg Thr Leu Leu Ale Met Ser Ser Ser Gin Asp Ser 



330 



360 



GCT CGG GCC AGO GCC AGT GCA GCA CTC CAC AAC ATC ATT 
Ser Lye Olu Ala Arg Ala Arg Ala Ser Al* Ala Leu Hi. A.n II. II. 

380 385 jsfu 



870 



1014 



1062 



1110 



TGT ATA TCC ATG CGA CAG TCT GGA TGT CTT CCT CTC CTC ATC CAG CTT 
SI ?2 X? St Arg Gin Ser Gly Cy. Leu Pro Leu Leu lie Gin Leu 
345 350 355 

TTA CAT GGC AAT GAC AAA GAC TCT GTA TTG TTG GGA AAT TCC CGG GGC 1158 
Si SI Sy £n Asp Ly. Asp St Val Leu Leu Gly Aen Ser Arg Gly 



1206 



1254 



1302 



CAC TCA CAO CCT GAT GAC AAG AGA GGC AGC COT GAA ATC CGA CTC CTT 
Hie Ser Gin Pro Asp A»p Lye Arg Gly Arg Arg Glu lie Arg Vel Leu 
395 400 

CAT CTT TTG GAA CAO ATA COC GCT TAC TGT GAA ACC TGT TGO GAG TGG 
Hie Leu Leu Glu Gin lie Arg Ala Tyr Cy. Glu Thr Cy. Trp Glu Trp 

«c CAA GCT CAT GAA CCA GGC ATG GAC CAO GAC AAA AAT CCA ATG CCA 1350 
£5! 2i £5 £S Gl£ Pro Gly Mat Aep Gin A.p Ly. A.n Pro Met Pro 

425 *30 435 
GCT CCT GTT GAA CAT CAG ATC TGT CCT OCT CTG TGT CTT CTA ATG AAA 1398 
Ala Pro Val lu His Gin He Cy« Pro Ala Val Cya Val Leu Met Lya 
440 445 «0 455 

CTT TCA TTT GAT GAA GAG CAT AGA CAT OCA ATG AAT GAA CTA GGG GGA 1446 
Leu Ser Phe Asp Glu Glu Hia Arg Hia Ala Met Asn Glu Leu Gly Gly 
460 465 470 

CTA CAG GCC ATT GCA GAA TTA TTG CAA CTG GAC TGT GAA ATG TAT GGG 1494 
Leu Gin Ala He Ala Glu Leu Leu Gin Val Asp Cys Glu Met Tyr Gly 
475 480 485 

CTT ACT AAT GAC CAC TAC AGT ATT ACA CTA AGA CGA TAT GCT GGA ATG 1542 
Leu Thr Asn Asp His Tyr Ser He Thr Leu Arg Arg Tyr Ala Gly Met 
490 495 500 

CCT TTG ACA AAC TTG ACT TTT GGA CAT GTA GCC AAC AAG GCT ACG CTA 1590 
Ala Leu Thr Asn Leu Thr Phe Gly Asp Val Ala Asn Lys Ala Thr Leu 
505 510 515 

TGC TCT ATG AAA GCC TGC ATG AGA GCA CTT GTG GCC CAA CTA AAA TCT 1638 
Cvs Ser Met Lys Gly Cys Met Arg Ala Leu Val Ala Gin Leu Lys Ser 
520 5?5 530 535 

GAA AGT CAA GAC TTA CAG CAG GTT ATT GCA AGT GTT TTG AGG AAT TTG 1686 
Glu Ser Glu Asp Leu Gin Gin Val He Ala Ser Val Leu Arg Asn Leu 
540 545 550 

TCT TGG CGA GCA GAT GTA AAT AGT AAA AAG ACG TTG CGA GAA GTT GGA 1734 
ser Trp Arg Ala Asp Val Asn Ser Lys Lys Thr Leu Arg Glu Val Gly 
555 560 565 

AGT GTG AAA GCA TTG ATG GAA TGT GCT TTA GAA CTT AAA AAG GAA TCA 1782 
Ser Val Lys Ala Leu Met Glu Cys Ala Leu Glu Val Lys Lys Glu Ser 
570 575 580 

ACC CTC AAA AGC GTA TTG AGT GCC TTA TGG AAT TTG TCA GCA CAT TGC 1830 
Thr Leu Lys Ser Val Leu Ser Ala Leu Trp Asn Leu Ser Ala His Cys 
585 590 595 

ACT GAG AAT AAA GCT GAT ATA TGT GCT GTA OAT GGT CCA CTT GCA TTT 1878 
Thr Glu Asn Lys Ala Asp He Cys Ala Val Asp Gly Ala Leu Ala Phe 
600 60S 610 615 

TTG GTT GGC ACT CTT ACT TAC CGG AGC CAG ACA AAC ACT TTA GCC ATT 1926 
Leu Val Gly Thr Leu Thr Tyr Arg Ser Gin Thr Asn Thr Leu Ala He 
620 525 630 

ATT GAA AGT GGA GGT GGG ATA TTA CGG AAT CTO TCC AGC TTG ATA GCT 1974 
He Glu Ser Gly Gly Gly He Leu Arg Asn Val Ser Ser Leu He Ala 
635 540 645 

ACA AAT GAG GAC CAC AGG CAA ATC CTA AGA GAG AAC AAC TGT CTA CAA 2022 
Thr Asn Glu Asp His Arg Gin He Leu Arg Glu Asn Asn Cys Leu Gin 
650 555 660 

ACT TTA TTA CAA CAC TTA AAA TCT CAT AGT TTG ACA ATA GTC AGT AAT 2070 
Thr Leu Leu Gin His Leu Lys Ser His Ser Leu Thr He Val Ser Asn 

665 570 675 



amT cTC TCA CCA AGA AAT CCT AAA CAC CAG 

S SI S3 £ 25 S £ 25 £5 ». jjg - — »• - £ 

680 685 
700 



»m ri~r »TC GGA ACT GCT GCA GCT TTA AGG 

SSSSKSSSEESiSS. 

715 720 

CCA AAT AGG CCT GCC AAG TAC AAG GAT GCC AAT ACT ATG 
S E£ IS S Al. Ly. Tyr Lye A.p Ala A.n II. Met 



AAT CTC ATG 
Asn Leu Met 
730 

TCT CCT GCC 
Ser Pro Gly 
745 

CTA GAA GCA 
Leu Glu Ala 
760 

ATA GAC AAT 
He Asp Afln 



AAG CAA ACT 
Lye Gin Ser 



GAT AAT AGG 
Asp Asn Arg 
810 

CCA TAT TTG 
Pro Tyr Leu 
825 

AGC TTA GAT 
Ser Leu Aep 
840 

CGC GGA ATT 
Arg Gly llm 



ACT TCT TCA 
Thr ser Ser 



GCC AAA GTC 
Ala Lye val 
890 

AGA AGT TCT 
Arg Ser Ser 
905 



tec TTG CCA TCT CTT CAT GTT AGG AAA CAA AAA GCC 
Ser Ser Leu Pro Ser Leu Hi. Val Arg Lye Gin Lye Ala 
750 755 
r&T CCT CAG CAC TTA TCA GAA ACT TTT GAC AAT 

!£ 2 Z 2 & u. gc Ol« Thr Ph. A.p A£ 

765 

TTA AGT CCC AAC CCA TCT CAT CCT ACT AAG CAC AGA CAC 

S 52 wS ty Al* s« Hi. Arg s.r Ly. Gin Arg Hi. 

780 785 

CTC - AT cgt OAT TAT GTT TTT GAC ACC AAT CCA CAT GAT 
2u SI J-P Tyr val Ph. A.p Thr A.a Arg Hi. A.p 

e»C AAT TTT AAT ACT GGC AAC ATG ACT GTC CTT TCA 
J2 A^ A«n SI 2n Thr Gly A«n H. t Thr V*l L.« S«r 

815 

ACA GTG TTA CCC AGC TCC TCT TCA TCA AGA GGA 
J£ £ i£ ?S S Pro ser S.r S« Ser Ser Arg Gly 
830 835 

AGT TCT CGT TCT GAA AAA GAT AGA AGT TTG GAG AGA GAA 
52 S S ST Clu ly. A.p Arg S«r Leu Glu Arg Glu 
S45 

nnff eTA CGC AAC TAC CAT CCA OCA ACA OAA AAT CCA GGA 
Gly 2u Gly A»n Tyr Hi. Pro Al. Thr Glu A«n Pro Gly 

860 865 
»»« COA GOT TTG CAG ATC TCC ACC ACT OCA GCC CAG ATT 
i£ £J 5u Oln gj S« Thr Thr Al. Al| Gin II. 
875 

» M «»* ism GTG TCA GCC ATT CAT ACC TCT CAG GAA GAC 
15 S£ Si £ S| S II. Hi. Thr Ser Gin Glu hmg 

„„- »ec ACT GAA TTA CAT TCT GTG ACA GAT GAG AGA 

£S 12 £ S «■ S Hi. cy. V.1 Thr A.p Glu Arg 

910 
AAT GCA CTT AGA AGA AGC TCT OCT CCC CAT ACA CAT TCA AAC ACT TAC 2838 
Asn Ala Leu Arg Arg Ser Ser Ala Ala Hie Thr His Ser Asn Thr Tyr 
920 925 930 935 

AAT TTC ACT AAG TCG GAA AAT TCA AAT AGG ACA TGT TCT ATG CCT TAT 2886 
Aon Phe Thr Lys Ser Glu Asn S r Asn Arg Thr Cys Ser Met Pro Tyr 
940 945 950 

GCC AAA TTA GAA TAC AAG AGA TCT TCA AAT GAT AGT TTA AAT ACT GTC 2934 
Ala Lye Leu Glu Tyr Lys Arg Ser Ser Asn Aep Ser Leu Asn Ser Val 
955 960 965 

AGT^AGT AAT GAT GGT TAT GGT AAA AGA GGT CAA ATG AAA CCC TCG ATT 2982 
Ser Ser Asn Asp Gly Tyr Gly Lys Arg Gly Gin Met Lys Pro Ser He 
970 975 980 



3030 



GAA TCC TAT TCT GAA GAT GAT GAA AGT AAG TTT TGC AGT TAT GGT CAA 
Glu Ser Tyr Ser Glu Asp Aep Glu Ser Lys Phe Cys Ser Tyr Gly Gin 
985 990 995 

TAC CCA GCC GAC CTA GCC CAT AAA ATA CAT AGT GCA AAT CAT ATG CAT 3078 
Tyr Pro Ala Asp Leu Ala His Lye He His Ser Ala Asn Hie Met Asp 
1000 1005 1010 1015 

GAT AAT GAT GGA GAA CTA GAT ACA CCA ATA AAT TAT AGT CTT AAA TAT 3126 
Asp Asn Asp Gly Glu Leu Asp Thr Pro He Asn Tyr Ser Leu Lys Tyr 
1020 1025 1030 

TCA GAT GAG CAG TTG AAC TCT GGA AGG CAA AGT CCT TCA CAG AAT GAA 3174 
Ser Asp Glu Gin Leu Asn Ser Gly Arg Gin Ser Pro Ser Gin Asn Glu 
1035 1040 1045 

AGA TGG GCA AGA CCC AAA CAC ATA ATA GAA GAT GAA ATA AAA CAA AGT 3222 
Arg Trp Ala Arg Pro Lys His He He Glu Asp Glu He Lys Gin Ser 
1050 1055 1060 

GAG CAA AGA CAA TCA AGG AAT CAA AGT ACA ACT TAT CCT GTT TAT ACT 3270 
Glu Gin Arg Gin Ser Arg Asn Gin Ser Thr Thr Tyr Pro Val Tyr Thr 
1065 1070 1075 

GAG AGC ACT GAT GAT AAA CAC CTC AAG TTC CAA CCA CAT TTT GGA CAG 3318 
Glu Ser Thr Asp Asp Lys His Leu Lys Phe Gin Pro His Phe Gly Gin 
1080 1085 1090 1095 

CAG GAA TGT GTT TCT CCA TAC AGG TCA CGG GGA GCC AAT GGT TCA GAA 3366 
Gin Glu Cys Vel Ser Pro Tyr Arg Ser Arg Gly Ala Asn Gly Ser Glu 
1100 H05 1H0 

ACA AAT CGA GTG GGT TCT AAT CAT GGA ATT AAT CAA AAT GTA AGC CAG 3414 
Thr Asn Arg Val Gly Ser Asn His Gly He Asn Gin Asn Val Ser Gin 
1115 1120 1125 

TCT TTG TGT CAA GAA GAT GAC TAT GAA GAT GAT AAG CCT ACC AAT TAT 3462 
Ser Leu Cys Gin Glu Asp Asp Tyr Glu Asp Asp Lys Pro Thr Asn Tyr 
1130 1135 H40 

ACT GAA CGT TAC TCT GAA GAA GAA CAG CAT CAA GAA GAA GAG AGA CCA 3510 
Ser Glu Arg Tyr Ser Glu Glu Glu Gin His Glu Glu Glu Glu Arg Pro 
1145 1150 1155 



1305 



r.TO ACC GAA CTT CCX OCA OTC TCA CM CXC CCT AOA ACC AAA ICC AOC 
5S sir SI SIT Pr? AlaVal Ser Gin Hi. ProArg Thr Ly. S«r S^ 

S SS SS SJ sS q S« L« S«: S« OlnSer Al. Axg Hi. Ly^Ala 



w e» TCA GGA GOO AAA TCI CCC TCC AAA AOT C« OCT CM 
SSISZp^SotJSaLy. S« Pro Ser Ly. S~ Oly Al. Oln 

1355 13TO ajto 

&** act CCA CCT GAA CAC TAT GTT CAO OAO ACC CCA CTC ATG 
JS £S J£ IS S£ Si Hi- Tyr Val Oln Oln Thr Pro L.« Met 
1370 1375 " ou 

A«e ASA TOT ACT TCT OTC ACT TCA CTT OAT AGT TTT 6A0 AST 06T 
IS IS C?I iS S Valuer S~l**»P SerPh. 01- S.r Arg 
aca AAT TAT ACC ATA AAA TAT AAT GAA GAG AAA CGT CAT GTG GAT CAG 3558 
t£ Iln Tyr ser ?le if Tyr A.n Gl« Glu Ly. Arg HI. v.l A.p Gin 

1160 « 70 1175 

rnT ATT CAT TAX act TTA AAA TAT CCC ACA GAT ATT CCT TCA TCA CAG 3606 
£2 S S£ ™ £ SI ly. Tyr Al. Thr A.p 11. Pro S.r Ser Oln 

1180 1185 *x»v 

AAA CAG TCA TTT TCA TTC TCA AAO AOT TCA TCT GGA CAA AGC AOT AAA 36S4 
£S SS XI £ Phe ser Ly. Ser Ser S.r Gly Oln Ser Ser Ly. 
* 1195 *200 1205 

ACC GAA CAT ATG TCT TCA ACC ACT GAG AAT ACO TCC ACA CCT TCA TCT 3702 
Tte XI Hi. Met IS S.r Ser Ser Olu A.n Thr Ser Thr Pro Ser Ser 
1210 

AAT GCC AAO AGO CAO AAT CAO CTC CAT CCA ACT TCT OCA CAO AOT ACA 3750 
Jin SfS i£ £g gS A.n Oln Leu Hi. Pro Ser Ser Ale Oln Ser Arg 
1225 I 230 

ACT GGT CAG CCT CAA AAO OCT GCC ACT TGC AAA GTT TCT TCT ATT AAC 3798 

SJ XS £S SI Ly. Ala Al. Thr Cy. Ly. Val Ser Ser 11. A.n 
1240 1245 "50 1255 

CAA GAA ACA ATA CAG ACT TAT TCT GTA GAA GAT » »» ™ ™ 

1260 

TCA AGA TOT AGT TCA TTA TCA TCT TTC TCA TCA OCT GAA GAT GAA ATA 3894 

Ser £ 
1275 

GGA TGT AAT CAG ACO ACA CM CAA OCA OAT TCT GCT AAT ACC CTG CAA 
A8n C 
1290 



CAA GAA ACA ATA CAO ACT TAT TOT WiA ma w*a «w ~« 

SI SI Thr II. Gin Thr Tyr Cy. Val GluKmp Thr Pro II. Cy^Phe 

TCA AGA TOT AGT TCA TTA TCA TUX TTC TCA TCA GCT GAA OAT GAA A~» 
IS Irg* S> ??L S « r 1-0 S-r So 8 " IMS 

cca TGT AAT CAG ACO ACA CAG GAA OCA CAT TCT GCT AAT ACC t*~ ™» 
SJ 5 52 SJ i5 Thr Oln Ota.au Ser Ala A.n Thr L« Oln 

1290 1295 
»<» c« caa ATA AAA GGA AAO ATT CCA ACT ACG TCA CCT CAA CAT CCT 3990 

S S2 S t£ SJ Sj o "* *** l^* 1 * Glu *■» pro 



3942 



4038 



4086 



4134 



4182 



4230 



1385 
TCG ATT CCC ACC TCC CTT CAG ACT CAA CCA TCC ACT CCA ATC CTA AGT 
Ser lie Ala Ser Ser Val Cln Ser Clu Pro Cys Ser Gly Met Val Ser 
1400 1405 1410 1415 

CCC ATT ATA ACC CCC ACT CAT CTT CCA CAT ACC CCT CCA CAA ACC ATC 4326 
Cly 11 He Ser Pro Ser Asp Leu Pr Asp Ser Pro Cly Cln Thr Met 
1420 1425 1430 

CCA CCA ACC AGA AGT AAA ACA CCT CCA CCA CCT CCT CAA ACA CCT CAA 4374 
Pro Pro Ser Arg Ser Lys Thr Pro Pro Pro Pro Pro Cln Thr Ala Gl* 
1435 1440 1445 

ACC AAG CCA GAi* CTA CCT AAA AAT AAA OCA CCT ACT CCT CAA AAC AGA 4422 
Thr Lys Arg Clu Val Pro Lys Asn Lys Ala Pro Thr Ala Clu Lys Arg 
1450 1455 1460 

GAG AGT GCA CCT AAG CAA CCT CCA CTA AAT CCT CCA CTT CAC AGO CTC 4470 
Clu Ser Gly Pro Lys Cln Ala Ala Val Asn Ala Ala Val Cln Arg Val 
1465 1470 1475 

CAC GTT CTT CCA CAT CCT GAT ACT TTA TTA CAT TTT CCC ACA CAA AGT 4518 
Gin Val Leu Pro Asp Ala Asp Thr Leu Leu His Phe Ala Thr Clu Ser 
1480 1485 1490 1495 

ACT CCA CAT GGA TTT TCT TGT TCA TCC ACC CTC AGT CCT CTC ACC CTC 4566 
Thr Pro Asp Gly Phe Ser Cys Ser Ser Ser Lsu Ser Ala Leu Ser Leu 
1500 1505 1510 

CAT GAG CCA TTT ATA CAG AAA CAT CTG GAA TTA AGA ATA ATC CCT CCA 4614 
Asp Glu Pro Phe He Cln Lys Asp Val Clu Leu Arg Ha Met Pro Pro 
* 1515 1520 1525 

GTT CAG GAA AAT GAC AAT GCC AAT GAA ACA CAA TCA CAG CAC CCT AAA 4662 
Val Gin Glu Asn Asp Asn Gly Asn Clu Thr Clu Ser Glu Cln Pro Lys 
1530 1535 1540 

GAA TCA AAT GAA AAC CAA CAC AAA GAG CCA CAA AAA ACT ATT CAT TCT 4710 
Glu Ser Asn Glu Asn Gin Glu Lys Glu Ala Clu Lys Thr He Asp Ser 
1545 1550 1555 

GAA AAG GAC CTA TTA CAT CAT TCA GAT OAT CAT CAT ATT GAA ATA CTA 4758 
Glu Lys Asp Leu Leu Asp Asp Ser Asp Asp Asp Asp He Glu He Leu 

1560 1565 1570 1575 

GAA CAA TGT ATT ATT TCT CCC ATC CCA ACA AAC TCA TCA CCT AAA GGC 
Clu Glu Cys lis lis ser Ala Met Pro Thr Lys Ser Ser Arg Lys Gly 
J 1580 1585 1590 

AAA AAG CCA GCC CAG ACT CCT TCA AAA TTA CCT CCA CCT GTO GCA ACC 
Lys Lys Pro Ala Gin Thr Ala Ser Lys Leu Pro Pro Pro Val Ala Arg 
1595 1600 1605 

AAA CCA AGT CAG CTC CCT CTC TAC AAA CTT CTA CCA TCA CAA AAC AGG 
Lvs Pro Ser Gin Leu Pro Val Tyr Lys Leu Leu Pro Ser Gin Asn Arg 
1 1610 "IS 1620 



4806 



4854 



4902 



4950 



TAT TGT GTT GAA GOG ACA CCT ATA AAC TTT TCC ACA OCT ACA 
fS S? 2 S SS.1T Thr Pro lie A.H*. S.r Thr AX. Tjr 

1640 1645 

m _ _ m ATC GAA TCC CCT CCA AAT GAG TTA GCT GCT 

IS Si S S S S S a S f ~ «• — «■ — 



1660 



a « a 55 a S S f£= ™ S » » - 



mf __ cor AGA ACT ACA GAT GAG GCT CAA GGA GGA 

S S g S S S Si W A-P «« »U«. <=!, «y 
1690 1695 

- sis s s s s £ a ss s $ a a 
a ss s £ ass & 

1720 1725 

— -*r. mar ee* TTC CGT GTG AAA AAG ATA ATG GAC CAG CTC CAG 
^ 2 2 ™ 22 S 2 Ly. Ly.n. H~ Mp Gin ValGln 



™» r« TOT CCG TCO TCT TCT CCA CCC AAC AAA AAT CAG TTA GAT OCT 
g22S22l«ll. Pro A.n Ly. A-n Gin L-« A.P Gly 

1755 17oU 



1770 



GAA TAT AGG ACA COT GTA AGA AAA AAT GCA GAC TCA AAA AAT AAT TTA 
Oltt Tyr Arg Thr Arg Val Arg Ly. A.» Al. A-p S«r Ly. A.n A-n Lma 

1785 17*0 " 9S 

imw fiats AGA GTT TTC TCA GAC AAC AAA GAT TCA AAG AAA CAG AAT 
2 S 2 SrS SS 22 *-P *» Ly. ^l. «r> <"» j- 

1800 1805 AWW 

sgisssssss^sissaaa 



1835 



~~ *** ru CCA ACT CCT TAC TGT TTT TCA CGA AAT CAT TCT TTO AGT 
CCT ATT CAA GGA ACT cm aj»w * 



S S S5 a S K £ Sl* = "* " IS. 8 " s " 

rrx CAT TTT GAT GAT GAT GAT GTT GAC CTT TCC AGO GAA AAG GCT 

2 2 2 2 2 S ™ *•* f clu Ala 



5094 



S142 



5190 



5238 



5286 



5334 



5382 



5430 



5478 



5526 



5574 



5622 



5670 
GAA TTA AGA AAG GCA AAA GAA AAT AAG GAA TCA GAG OCT AAA GTT ACC 5718 
Glu Leu Arg Lys Ala Lys Glu Asn Lys Glu Ser Glu Ala Lys Val Thr 
1880 1885 1890 1895 

AGO CAC ACA GAA CTA ACC TCC AAC CAA CAA TCA OCT AAT AAG AGA GAA 5766 
Ser His Thr Glu Leu Thr Ser Asn Gin Gin Ser Ala Aan Lys Thr Gin 
1900 1905 1910 

GCT ATT GCA AAG GAG CCA ATA AAT CGA GGT CAG CCT AAA CCC ATA CTT 5814 
Ala lie Ala Lya Gin Pro He Asn Arg Gly Gin Pro Lys Pro Zle Leu 
1915 1920 1925 

CAG AAA CAA TCC ACT TTT CCC CAG TCA TC AAA GAC ATA CCA GAC AGA 5862 
Gin Lys Gin Ser Thr Phe Pro Gin Ser Ser Lys Asp He Pro Asp Arg 
1930 1935 1940 

GGG GCA GCA ACT GAT GAA AAG TTA CAG AAT TTT GCT ATT GAA AAT ACT 5910 
Gly Ala Ala Thr Asp Glu Lys Leu Gin Asn Phe Ala He Glu Asn Thr 
1945 1950 1955 

CCA GTT TGC TTT TCT CAT AAT TCC TCT CTG AGT TCT CTC AGT GAC ATT 5958 
Pro Val Cys Phe Ser His Asn Ser Ser Leu Ser Ser Leu Ser Asp He 
1960 1965 1970 1975 

GAC GAA GAA AAC AAC AAT AAA GAA AAT GAA CCT ATC AAA GAG ACT GAG 6006 
Asp Gin Glu Asn Asn Asn Lys Glu Asn Glu Pro Zle Lys Glu Thr Glu 
1980 1985 1990 

CCC CCT GAC TCA GAG GGA GAA CGA AGT AAA CCT GAA GCA TCA GGC TAT 6054 
Pro Pro Asp Ser Gin Gly Glu Pro Ser Lys Pro Gin Ala Ser Gly Tyr 
1995 2000 2005 

GCT CCT AAA TCA TTT CAT GTT GAA GAT ACC CGA GTT TGT TTC TCA AGA 6102 
Ala Pro Lys Ser Phe His Val Glu Asp Thr Pro Val Cys Phe Ser Arg 
2010 2015 2020 

AAC AGT TCT CTC AGT TCT CTT AGT ATT GAC TCT GAA GAT GAC CTG TTG 6150 
Asn Ser Ser Leu Ser Ser Leu Ser He Asp Ser Glu Asp Asp Leu Leu 

2025 2030 2035 

CAG GAA TGT ATA AGC TCC GCA ATG CCA AAA AAG AAA AAG CCT TCA AGA 6198 
Gin Glu Cys He Ser Ser Ala Met Pro Lys Lys Lys Lys Pro Ser Arg 
2040 2045 2050 2055 

CTC AAG GGT GAT AAT GAA AAA GAT AGT CCC AGA AAT ATG GGT GGC ATA 6246 
Leu Lys Gly Asp Asn Glu Lys His Ser Pro Arg Asn Met Gly Gly He 
2060 2065 2070 

TTA GGT GAA GAT CTG ACA CTT GAT TTG AAA GAT ATA GAG AGA CCA GAT 6294 
Leu Gly Glu Asp Leu Thr Leu Asp Leu Lye Asp He Gin Arg Pro Asp 
2075 2080 2085 

TCA GAA CAT GGT CTA TCC CCT GAT TCA GAA AAT TTT GAT TGC AAA GCT 6342 
Ser Glu His Gly Leu Ser Pro Asp Ser Glu Asn Phe Asp Trp Lys Ala 
2090 2095 2100 

ATT CAG GAA GGT GCA AAT TCC ATA GTA AGT AGT TTA CAT CAA GCT GCT 6390 
He Gin Glu Gly Ala Asn Ser He Val Ser Ser Leu His Gin Ala Ala 

2105 2110 2115 
CCT CCT CCA TOT TTA TCI AGA CAA OCT TCO TCT CAT TCA GAT TCC ATC 
& aS a2 SI Si S« Ar, Gin Al. Ser Ser A.p Ser A.p s.r g. 
2120 2125 ™*» 

2 ^ S 5S £ = S 2 S » Si £ 



2140 2145 

lAA ccc ttt 
,y« Pro Phe 
2155 2160 

SAG AAA AGT 
Slu Lys Ser 
2170 2175 

2185 2190 

E~ - - - s. s ™ °" s HL~ - s - a. 

2220 

2235 2240 

AAA AAA GGC CCA CCC CTT AAG ACT CCA CCC TCC AAA AGC CCT AGI CAA 
J£ JJi £ S Z« XV »r Pro Al. Ser Ly. S.r Pro S«r Clu 

2250 2255 

2265 2270 2275 

2280 2285 

i>& et»A OCT TCT AGA TCA CCA TCT AGA GAT TCO ACC CCT TCA AGA 
22 J£ S S S5 S~ Gly S.r Arg An> S« Thr Pro S^rAr, 

2300 



23X5 2320 m« 

2330 2335 " w 

„ xrx -c* tcc CCT AGT ACT GCT TCA ACT AAG TCC TCA 



6438 



6486 



6534 



6582 



6630 



6678 



6726 



6774 



6822 



6870 



6918 



6966 



7014 



7062 



7110 



2345 
GGT TCT GGA AAA ATC TCA TAT ACA TCT CCA GCT AGA CAG ATG AGC CAA 7158 
Giy Ser Gly Lye Met Ser Tyr Thr Ser Pro Gly Arg Gin Met ser Gin 
2360 2365 2370 2375 

CAG AAC CTT ACC AAA CAA ACA GOT TTA TCC AAG AAT GCC ACT ACT ATT 7206 
Gin Asn Leu Thr Lye Gin Thr Gly Leu Ser Lye Aen Ala Ser Ser He 
2380 2385 2390 

CCA AGA AGT GAG TCT GCC TCC AAA GGA CTA AAT CAG ATG AAT AAT GGT 7254 
Pro Aro Ser Glu Ser Ale Ser Lye Gly Leu Aen Gin Met Asn Aen Gly 
2395 2400 2405 

AAT GGA GCC AAT AAA AAG GTA GAA CTT TCT AGA ATG TCT TCA ACT f^AA 7302 
Aen Gly Ala Aen Lye Lye Val Glu Leu Ser Arg Met Ser Ser Thr Lye 
2410 2415 2420 

TCA AGT GCA AGT GAA TCT GAT AGA TCA GAA AGA CCT GTA TTA GTA CGC 7350 
Ser Ser Gly Ser Glu Ser Aep Arg Ser Glu Arg Pro Val Leu Val Arg 
2425 2430 2435 

CAG TCA ACT TTC ATC AAA GAA GCT CCA AGC CCA ACC TTA AGA AGA AAA 7398 
Gin Ser Thr Phe lie Lye Glu Ala Pro Ser Pro Thr Leu Arg Arg Lye 
2440 2445 2450 2455 

TTC GAG GAA TCT GCT TCA TTT GAA TCT CTT TCT CCA TCA TCT AGA CCA 7446 
Leu Glu Glu Ser Ala Ser Phe Glu Ser Leu Ser Pro Ser Ser Arg Pro 
2460 2465 2470 

GCT TCT CCC ACT AGG TCC CAG GCA CAA ACT CCA GTT TTA AGT CCT TCC 7494 
Ala Ser Pro Thr Arg Ser Gin Ala Gin Thr Pro Val Leu Ser Pro Ser 
2475 2480 2485 

CTT CCT GAT ATG TCT CTA TCC ACA CAT TCO TCT GTT CAG GCT GGT GGA 7542 
Leu Pro Aep Met Ser Leu Ser Thr Hie Ser Ser Val Gin Ala Gly Gly 
2490 2495 2500 

TGG CGA AAA CTC CCA CCT AAT CTC AGT CCC ACT ATA GAG TAT AAT GAT 7590 
Trp Aro Lye Leu Pro Pro Aen Leu Ser Pro Thr He Glu Tyr Aen Aep 
2505 2510 2515 

GGA AGA CCA GCA AAG CGC CAT GAT ATT GCA CGG TCT CAT TCT GAA AGT 7638 
Gly Arg Pro Ala Lye Arg Hie Aep He Ala Arg Ser Hie Ser Glu Ser 

2520 2525 2530 2535 

CCT TCT AGA CTT CCA ATC AAT AGG TCA GGA ACC TGG AAA CGT GAG CAC 7686 
Pro Ser Arg Leu Pro He Aen Arg Ser Gly Thr Trp Lye Arg Glu Hie 
2540 2545 2550 

AGC AAA CAT TCA TCA TCC CTT CCT CGA GTA AGC ACT TGG AGA AGA ACT 7734 
Ser Lye Hie Ser Ser Ser Leu Pro Arg Val Ser Thr Trp Arg Arg Thr 
J 2555 25«0 2565 

GGA ACT TCA TCT TCA ATT CTT TCT GCT TCA TCA GAA TCC AGT GAA AAA 7782 
Glv Ser Ser Ser Ser He Leu Ser Ala Ser Ser Glu Ser Ser Glu Lye 

J 2570 2575 2580 

GCA AAA AGT GAG GAT GAA AAA CAT CTG AAC TCT ATT TCA GGA ACC AAA 7830 
Ala Lye Ser Glu Aep Glu Lye Hie Val Aen Ser He Ser Gly Thr Lye 

2585 2590 2595 



„_ ... .. c cb. CTA icC CCA AAA GCA ACA TOO ASA AAA ATA 

SS ^ S£ ™ ffi S 2 Ala Ly. GlyThr Trp Arg Ly. 

... ... am- GAA TTT TCT CCC ACA AAT ACT ACT TCT CAG ACC GTT TCC 

K iin Si £ £ £ Thr A.n Sar Thr Ser Cln Thr V.1 S.r 

2620 2625 

CCT ACA AAT GCT OCT GAA TCA AAC ACT CTA ATT TAT CAA ATG 

12 !g Si £ 3? Thr Leu aj. 61n Met 

2635 2640 

CCA CCT CCT GTT TCT AAA ACA GAG GAT GTT TG6 GTG AGA ATT GAG GAC 
£ £ £ £ £ Si Ar Glu A-p Val Trp Val Arg II. Glu A-p 
2650 2655 2660 

„_ «c fjn CCT AGA TCT GGA AGA TCT CCC ACA GOT AAT ACT 

SI £ S £ £ £ IS S«r Gly Arg Ser Pro Thr Gly A.n Thr 

J 2665 2670 2675 

r-rr. ATT GAC ACT CTT TCA GAA AAC GCA AAT CCA AAC ATT AAA 

£ £ SS £ S£ £ S 52 «i« ty. au *« *« J& 

2680 2685 2WU 

wa GAT AAT CAG GCA AAA CAA AAT GTG GGT AAT GGC ACT GTT 

Asp s2 Si Asp ££ Al. Ly. Gin A.nVal Gly A.n Gly S« W 

GTG GGT TTG GAA AAT CGC CTG ACC TCC TTT ATT CAG 
£££ £ ?S SS 2- «- A-n o Arg Uu Thr S.r Ph. g Il. Gin 

^« riT ccc CCT GAC CAA AAA GGA ACT GAG ATA AAA CCA GGA CAA AAT 
SS SiJ SS 2? §£ Si GlyThr Glu II. Ly. ProOly Gin A.» 

AAT CCT GTC CCT GTA TCA GAG ACT AAT CAA ACT CCT ATA GTG GAA COT 
£ £ JS Pro V.1 S«r Glu Thr A.n Glu S« Pro II. Val Glu Arg 
2745 2750 2755 

ACC CCA TTC ACT TCT AGC AGC TCA AGC AAA CAC AGT TCA CCT AGT 066 
& £ £ £ £ 8« 8« S.r Sor Ly. Hi. o S.r S.r Pro S« Jly. 
2760 2765 

ACT GTT GCT GCC AGA GTG ACT CCT TTT AAT TAC AAC CCA AGC CCT AGO 
£ SS £ SS Ar? VI Thr Pro Ph. A« Tyr A.n Pro *ur Pro Arg 

2780 2785 * '» w 

*&* Aiac acc CCA CAT AGC ACT TCA GCT CGG CCA TCT CAG ATC CCA ACT 
t£ £ £ £ S Thr ST Al^Arg Pro St Gin lUPro Thr 

rcA c*TG AAT AAC AAC ACA AAG AAG CGA GAT TCC AAA ACT GAC AGC ACA 

2810 2815 2820 

r*A TCC ACT GGA ACC CAA AGT CCT AAG CGC CAT TCT GGG TCT TAC CTT 
£ £ £ £J £ ^ Pro Ly. Arg Hi. St Gly St Tyr Leu 
2825 2830 2835 



7878 



7926 



7974 



80* 



8070 



8118 



8166 



8214 



8262 



8310 



6358 



6406 



8454 



8502 



8550 
GTG ACA TCT GTT TAAAAGAGAG GAAGAATGAA ACTAAGAAAA TTCTATGTTA 8602 

Val Thr Ser Val 

2840 

ATTACAACTC CTATATAGAC ATTTTGTTTC AAATGAAACT TTAAAAGACT GAAAAATTTT 8662 

GTAAATAGGT TTGATTCTTG TTACAGGGTT TTTGTTCTGG AAGCCATATT TGATAGTATA 8722 

CTTTGTCTTC ACTGGTCTTA TTTTGGCAGG CACTCTTGAT GGTTAGGAAA AAATAGAAAG 8782 

CCAAGTATGT TTGTACAGTA TGT7TTACAT GTATTTAAAG TAGCATCCCA TCCCAACTTC 8842 

CTTAATTATT GCTTGTCTAA AATAATGAAC ACTACAGATA GGAAATATGA TATATTGCTG 8902 

TTATCAATCA TTTCTAGATT ATAAACTGAC TAAACTTACA TGAGGGGAAA ATTGGTATTT 8962 

ATGCAAAAAA AAAATGTTTT TGTCCTTGTG AGTCCATCTA ACATCATAAT TAATGATGTG 9022 

GCTGTGAAAT TCACAGTAAT ATGGTTCCCG ATGAAGAAGT TTACCCAGCC TGCTTTGCTT 9082 

ACTGCATGAA TGAAACTGAT GGTTGAATTT GAGAAGTAAT GATTAACAGT TATGTGGTGA 9142 

CATGATGTGC ATAGAGATAG CTACAGTGTA ATAATTTACA CTATTTTGTG CTCGAAACAA 9202 

AACAAAAATC TGTGTAACTG TAAAACATTG AATGAAACTA TTTTACCTGA ACTAGATTTT 9262 

ATCTGAAAGT AGGTAGAATT TTTGCTATGC TGTAATTTGT TGTATATTCT GGTATTTGAG 9322 

GTGAGATGGC TGCTCTTTAT TAATGAGACA TGAATTGTGT CTCAAGAGAA ACTAAATGAA 9382 

CATTTCAGAA TAAATTATTG CTGTATGTAA ACTGTTACTG AAATTGGTAT TTCTTTGAAC 9442

CGTCTCTTTC ACATTTGTAT TAATTAATTG TTTAAAATGC CTCTTTTAAA AGCTTATATA 9502

AATTTTTTCT TCAGCTTCTA TGCATTAAGA GTAAAATTCC TCTTACTGTA ATAAAAAGAT 9562 

TGAAGAAGAC TGTTGCCACT TAACCATTCC ATGCGTTGGC ACTT 9606 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2843 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu
1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn
20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu
35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly
50 55 60



Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser
65 70 75 80

Asn Phe Pro Gly Val Lys Leu Arg Ser Lys Met Ser Leu Arg Ser Tyr
85 90 95

Gly Ser Arg Glu Gly Ser Val Ser Ser Arg Ser Gly Glu Cys Ser Pro
100 105 110



Val Pro Met Gly Ser Phe Pro Arg Arg Gly Phe Val Asn Gly Ser Arg
115 120 125

Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu
130 135 140

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala
145 150 155 160

Gln Leu Gln Asn Leu Thr Lys Arg Ile Asp Ser Leu Pro Leu Thr Glu
165 170 175

Asn Phe Ser Leu Gln Thr Asp Leu Thr Arg Arg Gln Leu Glu Tyr Glu
180 185 190

Ala Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln
195 200 205

Asp Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile
210 215 220

Glu Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr
225 230 235 240

Glu Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp
245 250 255

Ala Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala
260 265 270

Thr Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr
275 280 285

Ala Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu
290 295 300

Thr Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser
305 310 315 320

Met Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala
325 330 335

Met Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys
340 345 350

Leu Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val
355 360 365

Leu Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser
370 375 380

Ala Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly
385 390 395 400

Arg Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr
405 410 415

Cys Glu Thr Cys Trp Glu Trp Gln Glu Ala His Glu Pro Gly Met Asp
420 425 430

Gln Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro
435 440 445

Ala Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His
450 455 460

Ala Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln
465 470 475 480

Val Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr
485 490 495

Leu Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp
500 505 510

Val Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala
515 520 525

Leu Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile
530 535 540

Ala Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys
545 550 555 560

Lys Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala
565 570 575

Leu Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu
580 585 590

Trp Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala
595 600 605

Val Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser
610 615 620

Gln Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg
625 630 635 640

Asn Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu
645 650 655

Arg Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His
660 665 670

Ser Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser
675 680 685

Ala Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val
690 695 700

Ser Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met
705 710 715 720

Gly Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys
725 730 735

Tyr Lys Asp Ala Asn Ile Met Ser Pro Gly Ser Ser Leu Pro Ser Leu
740 745 750

His Val Arg Lys Gln Lys Ala Leu Glu Ala Glu Leu Asp Ala Gln His
755 760 765

Leu Ser Glu Thr Phe Asp Asn Ile Asp Asn Leu Ser Pro Lys Ala Ser
770 775 780

His Arg Ser Lys Gln Arg His Lys Gln Ser Leu Tyr Gly Asp Tyr Val
785 790 795 800

Phe Asp Thr Asn Arg His Asp Asp Asn Arg Ser Asp Asn Phe Asn Thr
805 810 815

Gly Asn Met Thr Val Leu Ser Pro Tyr Leu Asn Thr Thr Val Leu Pro
820 825 830

Ser Ser Ser Ser Ser Arg Gly Ser Leu Asp Ser Ser Arg Ser Glu Lys
835 840 845

Asp Arg Ser Leu Glu Arg Glu Arg Gly Ile Gly Leu Gly Asn Tyr His
850 855 860

Pro Ala Thr Glu Asn Pro Gly Thr Ser Ser Lys Arg Gly Leu Gln Ile
865 870 875 880

Ser Thr Thr Ala Ala Gln Ile Ala Lys Val Met Glu Glu Val Ser Ala
885 890 895

Ile His Thr Ser Gln Glu Asp Arg Ser Ser Gly Ser Thr Thr Glu Leu
900 905 910

His Cys Val Thr Asp Glu Arg Asn Ala Leu Arg Arg Ser Ser Ala Ala
915 920 925

His Thr His Ser Asn Thr Tyr Asn Phe Thr Lys Ser Glu Asn Ser Asn
930 935 940

Arg Thr Cys Ser Met Pro Tyr Ala Lys Leu Glu Tyr Lys Arg Ser Ser
945 950 955 960

Asn Asp Ser Leu Asn Ser Val Ser Ser Asn Asp Gly Tyr Gly Lys Arg
965 970 975

Gly Gln Met Lys Pro Ser Ile Glu Ser Tyr Ser Glu Asp Asp Glu Ser
980 985 990

Lys Phe Cys Ser Tyr Gly Gln Tyr Pro Ala Asp Leu Ala His Lys Ile
995 1000 1005

His Ser Ala Asn His Met Asp Asp Asn Asp Gly Glu Leu Asp Thr Pro
1010 1015 1020

Ile Asn Tyr Ser Leu Lys Tyr Ser Asp Glu Gln Leu Asn Ser Gly Arg
1025 1030 1035 1040

Gln Ser Pro Ser Gln Asn Glu Arg Trp Ala Arg Pro Lys His Ile Ile
1045 1050 1055

Glu Asp Glu Ile Lys Gln Ser Glu Gln Arg Gln Ser Arg Asn Gln Ser
1060 1065 1070

Thr Thr Tyr Pro Val Tyr Thr Glu Ser Thr Asp Asp Lys His Leu Lys
1075 1080 1085

Phe Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser
1090 1095 1100

Arg Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly
1105 1110 1115 1120

Ile Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu
1125 1130 1135

Asp Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln
1140 1145 1150

His Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu
1155 1160 1165

Glu Lys Arg His Val Asp Gln Pro Ile Asp Tyr Ser Leu Lys Tyr Ala
1170 1175 1180

Thr Asp Ile Pro Ser Ser Gln Lys Gln Ser Phe Ser Phe Ser Lys Ser
1185 1190 1195 1200

Ser Ser Gly Gln Ser Ser Lys Thr Glu His Met Ser Ser Ser Ser Glu
1205 1210 1215

Asn Thr Ser Thr Pro Ser Ser Asn Ala Lys Arg Gln Asn Gln Leu His
1220 1225 1230

Pro Ser Ser Ala Gln Ser Arg Ser Gly Gln Pro Gln Lys Ala Ala Thr
1235 1240 1245

Cys Lys Val Ser Ser Ile Asn Gln Glu Thr Ile Gln Thr Tyr Cys Val
1250 1255 1260

Glu Asp Thr Pro Ile Cys Phe Ser Arg Cys Ser Ser Leu Ser Ser Leu
1265 1270 1275 1280

Ser Ser Ala Glu Asp Glu Ile Gly Cys Asn Gln Thr Thr Gln Glu Ala
1285 1290 1295

Asp Ser Ala Asn Thr Leu Gln Ile Ala Glu Ile Lys Gly Lys Ile Gly
1300 1305 1310

Thr Arg Ser Ala Glu Asp Pro Val Ser Glu Val Pro Ala Val Ser Gln
1315 1320 1325

His Pro Arg Thr Lys Ser Ser Arg Leu Gln Gly Ser Ser Leu Ser Ser
1330 1335 1340

Glu Ser Ala Arg His Lys Ala Val Glu Phe Pro Ser Gly Ala Lys Ser
1345 1350 1355 1360

Pro Ser Lys Ser Gly Ala Gln Thr Pro Lys Ser Pro Pro Glu His Tyr
1365 1370 1375

Val Gln Glu Thr Pro Leu Met Phe Ser Arg Cys Thr Ser Val Ser Ser
1380 1385 1390

Leu Asp Ser Phe Glu Ser Arg Ser Ile Ala Ser Ser Val Gln Ser Glu
1395 1400 1405

Pro Cys Ser Gly Met Val Ser Gly Ile Ile Ser Pro Ser Asp Leu Pro
1410 1415 1420

Asp Ser Pro Gly Gln Thr Met Pro Pro Ser Arg Ser Lys Thr Pro Pro
1425 1430 1435 1440

Pro Pro Pro Gln Thr Ala Gln Thr Lys Arg Glu Val Pro Lys Asn Lys
1445 1450 1455

Ala Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val
1460 1465 1470

Asn Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu
1475 1480 1485

Leu His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser
1490 1495 1500

Ser Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val
1505 1510 1515 1520

Glu Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu
1525 1530 1535

Thr Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu
1540 1545 1550

Ala Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp
1555 1560 1565

Asp Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro
1570 1575 1580

Thr Lys Ser Ser Arg Lys Gly Lys Lys Pro Ala Gln Thr Ala Ser Lys
1585 1590 1595 1600

Leu Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys
1605 1610 1615

Leu Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe
1620 1625 1630

Thr Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro
1635 1640 1645

Ile Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser
1650 1655 1660

Pro Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln
1665 1670 1675 1680

Ser Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser
1685 1690 1695

Thr Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu
1700 1705 1710

Leu Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile
1715 1720 1725

Asn Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys
1730 1735 1740

Lys Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro
1745 1750 1755 1760

Asn Lys Asn Gln Leu Asp Gly Lys Lys Lys Lys Pro Thr Ser Pro Val
1765 1770 1775

Lys Pro Ile Pro Gln Asn Thr Glu Tyr Arg Thr Arg Val Arg Lys Asn
1780 1785 1790

Ala Asp Ser Lys Asn Asn Leu Asn Ala Glu Arg Val Phe Ser Asp Asn
1795 1800 1805

Lys Asp Ser Lys Lys Gln Asn Leu Lys Asn Asn Ser Lys Asp Phe Asn
1810 1815 1820

Asp Lys Leu Pro Asn Asn Glu Asp Arg Val Arg Gly Ser Phe Ala Phe
1825 1830 1835 1840

Asp Ser Pro His His Tyr Thr Pro Ile Glu Gly Thr Pro Tyr Cys Phe
1845 1850 1855

Ser Arg Asn Asp Ser Leu Ser Ser Leu Asp Phe Asp Asp Asp Asp Val
1860 1865 1870

Asp Leu Ser Arg Glu Lys Ala Glu Leu Arg Lys Ala Lys Glu Asn Lys
1875 1880 1885

Glu Ser Glu Ala Lys Val Thr Ser His Thr Glu Leu Thr Ser Asn Gln
1890 1895 1900

Gln Ser Ala Asn Lys Thr Gln Ala Ile Ala Lys Gln Pro Ile Asn Arg
1905 1910 1915 1920

Gly Gln Pro Lys Pro Ile Leu Gln Lys Gln Ser Thr Phe Pro Gln Ser
1925 1930 1935

Ser Lys Asp Ile Pro Asp Arg Gly Ala Ala Thr Asp Glu Lys Leu Gln
1940 1945 1950

Asn Phe Ala Ile Glu Asn Thr Pro Val Cys Phe Ser His Asn Ser Ser
1955 1960 1965

Leu Ser Ser Leu Ser Asp Ile Asp Gln Glu Asn Asn Asn Lys Glu Asn
1970 1975 1980

Glu Pro Ile Lys Glu Thr Glu Pro Pro Asp Ser Gln Gly Glu Pro Ser
1985 1990 1995 2000

Lys Pro Gln Ala Ser Gly Tyr Ala Pro Lys Ser Phe His Val Glu Asp
2005 2010 2015

Thr Pro Val Cys Phe Ser Arg Asn Ser Ser Leu Ser Ser Leu Ser Ile
2020 2025 2030

Asp Ser Glu Asp Asp Leu Leu Gln Glu Cys Ile Ser Ser Ala Met Pro
2035 2040 2045

Lys Lys Lys Lys Pro Ser Arg Leu Lys Gly Asp Asn Glu Lys His Ser
2050 2055 2060

Pro Arg Asn Met Gly Gly Ile Leu Gly Glu Asp Leu Thr Leu Asp Leu
2065 2070 2075 2080

Lys Asp Ile Gln Arg Pro Asp Ser Glu His Gly Leu Ser Pro Asp Ser
2085 2090 2095

Glu Asn Phe Asp Trp Lys Ala Ile Gln Glu Gly Ala Asn Ser Ile Val
2100 2105 2110

Ser Ser Leu His Gln Ala Ala Ala Ala Ala Cys Leu Ser Arg Gln Ala
2115 2120 2125

Ser Ser Asp Ser Asp Ser Ile Leu Ser Leu Lys Ser Gly Ile Ser Leu
2130 2135 2140

Gly Ser Pro Phe His Leu Thr Pro Asp Gln Glu Glu Lys Pro Phe Thr
2145 2150 2155 2160

Ser Asn Lys Gly Pro Arg Ile Leu Lys Pro Gly Glu Lys Ser Thr Leu
2165 2170 2175

Glu Thr Lys Lys Ile Glu Ser Glu Ser Lys Gly Ile Lys Gly Gly Lys
2180 2185 2190

Lys Val Tyr Lys Ser Leu Ile Thr Gly Lys Val Arg Ser Asn Ser Glu
2195 2200 2205

Ile Ser Gly Gln Met Lys Gln Pro Leu Gln Ala Asn Met Pro Ser Ile
2210 2215 2220

Ser Arg Gly Arg Thr Met Ile His Ile Pro Gly Val Arg Asn Ser Ser
2225 2230 2235 2240

Ser Ser Thr Ser Pro Val Ser Lys Lys Gly Pro Pro Leu Lys Thr Pro
2245 2250 2255

Ala Ser Lys Ser Pro Ser Glu Gly Gln Thr Ala Thr Thr Ser Pro Arg
2260 2265 2270

Gly Ala Lys Pro Ser Val Lys Ser Glu Leu Ser Pro Val Ala Arg Gln
2275 2280 2285

Thr Ser Gln Ile Gly Gly Ser Ser Lys Ala Pro Ser Arg Ser Gly Ser
2290 2295 2300

Arg Asp Ser Thr Pro Ser Arg Pro Ala Gln Gln Pro Leu Ser Arg Pro
2305 2310 2315 2320

Ile Gln Ser Pro Gly Arg Asn Ser Ile Ser Pro Gly Arg Asn Gly Ile
2325 2330 2335

Ser Pro Pro Asn Lys Leu Ser Gln Leu Pro Arg Thr Ser Ser Pro Ser
2340 2345 2350

Thr Ala Ser Thr Lys Ser Ser Gly Ser Gly Lys Met Ser Tyr Thr Ser
2355 2360 2365

Pro Gly Arg Gln Met Ser Gln Gln Asn Leu Thr Lys Gln Thr Gly Leu
2370 2375 2380

Ser Lys Asn Ala Ser Ser Ile Pro Arg Ser Glu Ser Ala Ser Lys Gly
2385 2390 2395 2400

Leu Asn Gln Met Asn Asn Gly Asn Gly Ala Asn Lys Lys Val Glu Leu
2405 2410 2415

Ser Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser
2420 2425 2430

Glu Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro
2435 2440 2445

Ser Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser
2450 2455 2460

Leu Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln
2465 2470 2475 2480

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His
2485 2490 2495

Ser Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser
2500 2505 2510

Pro Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile
2515 2520 2525

Ala Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser
2530 2535 2540

Gly Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg
2545 2550 2555 2560

Val Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala
2565 2570 2575

Ser Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val
2580 2585 2590

Asn Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala
2595 2600 2605

Lys Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn
2610 2615 2620

Ser Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser
2625 2630 2635 2640

Lys Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp
2645 2650 2655

Val Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly
2660 2665 2670

Arg Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu
2675 2680 2685

Lys Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln
2690 2695 2700

Asn Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn
2705 2710 2715 2720

Arg Leu Thr Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr
2725 2730 2735

Glu Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn
2740 2745 2750

Glu Ser Pro Ile Val Glu Arg Thr Pro Phe Ser Ser Ser Ser Ser Ser
2755 2760 2765

Lys His Ser Ser Pro Ser Gly Thr Val Ala Ala Arg Val Thr Pro Phe
2770 2775 2780

Asn Tyr Asn Pro Ser Pro Arg Lys Ser Ser Ala Asp Ser Thr Ser Ala
2785 2790 2795 2800

Arg Pro Ser Gln Ile Pro Thr Pro Val Asn Asn Asn Thr Lys Lys Arg
2805 2810 2815

Asp Ser Lys Thr Asp Ser Thr Glu Ser Ser Gly Thr Gln Ser Pro Lys
2820 2825 2830

Arg His Ser Gly Ser Tyr Leu Val Thr Ser Val
2835 2840

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3172 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: DP1(TB2)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..630

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CCA CTC CCC GCT CCA GTC TAT CCG OCA CTA GGA ACA GCC CCG GGN GGC 48
Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

CJtfi ACC GTC CCC GCC ATG TCT GCG GCC ATG AGG GAG AGG TTC GAC CGG 96
Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

TTC CTO CAC OAO AAO AAC TGC AT6 ACT GAC CTT CTO GCC AAC CTC GAG 144
Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

CCCAAAACC<WCTOAACAOCAGCTTCATCCCTCTT(^arCATCCCA 192
Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

CM GTG GCC TTC TAC CTC GTG TTC GCT TAT CCA CCC TCT CTC CTC TGC 240
Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

AAC CTC ATA CCA TTT CCC TAC CCA GCC TAC ATC TCA ATT AAA OCT ATA 288
Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

CAC ACT CCC AAC AAA GAA CAT CAT ACC CAC TCC CT ACC TAC TCG GTA 336
Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

GTG TAT CCT CTC TTC ACC ATT CCT CAA TTC TTC TCT CAT ATC TTC CTC 384
Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

TCA TGC TTC CCC TTC TAC TAC ATC CTC AAC TCT GCC TTC CTC TTC TCC 432
Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

TCC ATG GCC CCC ACC CCT TCT AAT CGO CCT GAA CTC CTC TAC AAC CCC 480
Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

ATC ATC CCT CCT TTC TTC CTC AAG CAC GAG TCC CAG ATC GAC ACT GTG 528
Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

GTC AAG GAC CTT AAA GAC AAG TCC AAA GAG ACT CCA GAT GCC ATC ACT 576
Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

AAA GAA GCG AAC AAA CCT ACC GTG AAT TTA CTG CCT CAA GAA AAG AAG 624
Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

AGC ACC TAAACCAGAC TAAACCAOAC TGGATGGAAA CTTCCTGCCC TCTCTGTACC 680
Ser Thr
210

TTCCTACTGG AGCTTGATGT TATATTAGGG ACTGTGGTAT AATTATTTTA ATAATGTTGC 740
CTTGGAAACA TTTTTGAGAT ATTAAAGATT GGAATGTGTT GTAAGTTTCT TTGCTTACTT 800
TTACTGTCTA TATATATACC GAGCACTTTA AACTTAATGC AGTGGGCAGT GTCCACGTTT 860
TTGGAAAATG TATTTTGCCT CTGGGTAGGA AAACATGTAT CTTGCTATCC TGCAGGAAAT 920
ATAAACTTAA AATAAAATTA TATACCCCAC AGGCTGTGTA CTTTACTGGG CTCTCCCTGC 980
ACGSATTTTC TCTGTAGTTA CATTTAGCRT AATCTTTATG CTTCTACTTC CTKTAATGTA 1040
CAATTTTATA TAATTCNCRA TCTATTTCTC CACATCTACA TATGGAAATG 1100
TTACTCTCTC ACTACANCAT CCATCATGCT CATGGGGAGG GAGCAGGGGA AGGTTGTATG 1160
TGTCATTTAT AACTTCTCTA CACTAACACC ACCTGCGAAA AGCTGGAGGA ACCATTGTGC 1220
TCGTGTGGTC TACTAAATAA TACTTTACCA AATACGTGAT TAATATGCAA GTGAACAAAG 1280
TGAGAAATGA AATCGAATGG AGATTGGCCT GGTTGTTTCC GTAGTATATG GCATATGAAT 1340
ACCAGGATAG CTTTATAAAG CACTTACTTA GTTAGTTACT CACTCTACTG ATAAATCGGG 1400
AAATTTACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAG 1460
AGTACCCTGT AACTCTCAAI TCCCTGAAAA ACTAGTAATA CTGTCTTATC TCCTATAAAC 1520
TTTACATATT TGTCTATTGT CAAGATGCTA CANTOGAHNC CATTTCTCCT TTTATCTTCA 1580
NAG5GGAGAN ACATGTTGAT TTAGTCTTCT TTCCCAATCT TCTTTTTTAA MCCAGTTTNA 1640
GGMNCTTCTG RAGATTTGYC CACCTCTGAT TACATGTATG TTCTYCTTTC TATCATKAGC 1700
AACAACATGC TAATGRCGAC ACCTAGCTCT 'RAGMGCAATT CTGCGAGANT GABAGGNWGT 1760
ATARAGTMNC CCATAATCTO CTTCGCAATA GTTAAGTCAA TCTATCTTCA GTTTTTCTCT 1820
CGCCTTTAAG GTCAAACACA AOAGGCTICC CTAGTTTACA ACTCAGAGTC ACTTGTACTC 1880
CATTTAAATG CCCTCATCCG IATTCTTTGT GTTOATAAGC TCCACAKCAC TACATAGTAA 1940
CTACAGANCA GTAAAGTTAA RNCGGATGTC TCCATTGATC TGCCAANTOG NTATAGAGAG 2000
CAATTTGTCT GGACTAGAAA ATCTGAOTTI TACACCATAC TCTTAAGACT CCTTTTGAAT 2060
TAAACTAGAC TAAAACAAGT GTATAACTAA ACTAACAAGA TTAAATATCC AGCCAGTACA 2120
GTATTTTTTA AGGCAAATAA AGATGATTAG CTCACCTTGA GNTAACAATC AGGTAAGATC 2180
ATRACAATGT CTCATGATGT HAAHAATATT AAAGATATCA ATACTAAGTG ACAGTATCAC 2240
NNCTAATAXA ATATGOATCA OAGCATTTAT rPTGGGGAGG AAAACAGTCG TGATTACCGG 2300
CATTTTATTA AACTTAAAAC TTTGTAGAAA GCAAACAAAA TTGTTCTTGG GAGAAAATCA 2360
ACTTTTAGAT TAAAAAAATT TTAAGTAKCT AGGAGTATTT AAATCCTTTT CCCATAAATA 2420
AAAGTACAGT ri ' l V lTGO T O OCAGAATGAA AATCAGCAAC NTCTAGCATA TAGACTATAT 2480
AATCAGATTG ACAGCAIATA CAAXATATTA TCAGACAAGA TGAGGAGGTA CAAAAGTTAC 2540
TATTGCTCAT AATGACTTAC AGGCTAAAAN TAGNTOIAAA AIACTATATT AAATTCTGAA 2600
TCCAATTTTT TTTTGTTCCC TTGAGACCAA AATTTAAGTT AACTGTTGCT GGCAGTCTAA 2660
OTGTAAATGT TAACAGCAGG ASAAGTTAAG AATTGAGCA© TTCTGTTGCA TGATTTCCCA 2720
AATGAAATAC TGCCTICCCT AGAGTTTOAA AAACTAATTO AGCCTGTGCC TGGCTAGAAA 2780
ACAAOCGTTT ATTTGAATGT OAATAOTCTT TCAAAGOTAT GTAGTTACAG AATTCCTACC 2840
AAACAGCTTA AATTCTTCAA GAAAGAATTC CTGCAGCAOT TATTCCCTTA CCTGAAGGCT 2900
TCAATCATTT GGATCAACAA CTGCTACTCT CGGGAAGACT CCTCTACTCA CAGCTGAAGA 2960
AAATGAGCAC ACCCTTCACA CTCTTATCAC CTATCCTGAA CATGTGATAC ACTGAATGGA 3020
AATAAAIAGA TGTAAAIAAA ATTGAGWTCT CATTTAAAAA AAACCATGTO CCCAATGGCA 3080
AAATGACCTC ATGTTGTCGT TTAAACACCA ACTGCACCCA CTAGCACAGC CCATTGAGCT 3140
AMCCTATAXA IACATCTCTO TCAGTOCCCC TC 3172



(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 210 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

Ser Thr
210

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 434 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
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PCT/US92/00376 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: TB1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Val Ala Pro Val Val Val Gly Ser Gly Arg Ala Pro Arg His Pro Ala 
1 5 10 15 

Pro Ala Ala Met His Pro Arg Arg Pro Asp Gly Phe Asp Gly Leu Gly
20 25 30

Tyr Arg Gly Gly Ala Arg Asp Glu Gln Gly Phe Gly Gly Ala Phe Pro
35 40 45

Ala Arg Ser Phe Ser Thr Gly Ser Asp Leu Gly His Trp Val Thr Thr
50 55 60

Pro Pro Asp Ile Pro Gly Ser Arg Asn Leu His Trp Gly Glu Lys Ser
65 70 75 80

Pro Pro Tyr Gly Val Pro Thr Thr Ser Thr Pro Tyr Glu Gly Pro Thr
85 90 95

Glu Glu Pro Phe Ser Ser Gly Gly Gly Gly Ser Val Gln Gly Gln Ser
100 105 110

Ser Glu Gln Leu Asn Arg Phe Ala Gly Phe Gly Ile Gly Leu Ala Ser
115 120 125

Leu Phe Thr Glu Asn Val Leu Ala His Pro Cys Ile Val Leu Arg Arg
130 135 140

Gln Cys Gln Val Asn Tyr His Ala Gln His Tyr His Leu Thr Pro Phe
145 150 155 160

Thr Val Ile Asn Ile Met Tyr Ser Phe Asn Lys Thr Gln Gly Pro Arg
165 170 175

Ala Leu Trp Lys Gly Met Gly Ser Thr Phe Ile Val Gln Gly Val Thr
180 185 190

Leu Gly Ala Glu Gly Ile Ile Ser Glu Phe Thr Pro Leu Pro Arg Glu
195 200 205

Val Leu His Lys Trp Ser Pro Lys Gln Ile Gly Glu His Leu Leu Leu
210 215 220

Lys Ser Leu Thr Tyr Val Val Ala Met Pro Phe Tyr Ser Ala Ser Leu
225 230 235 240

Ile Glu Thr Val Gln Ser Glu Ile Ile Arg Asp Asn Thr Gly Ile Leu
245 250 255

Glu Cys Val Lys Glu Gly Ile Gly Arg Val Ile Gly Met Gly Val Pro
260 265 270

His Ser Lys Arg Leu Leu Pro Leu Leu Ser Leu Ile Phe Pro Thr Val
275 280 285

Leu His Gly Val Leu His Tyr Ile Ile Ser Ser Val Ile Gln Lys Phe
290 295 300

Val Leu Leu Ile Leu Lys Arg Lys Thr Tyr Asn Ser His Leu Ala Glu
305 310 315 320

Ser Thr Ser Pro Val Gln Ser Met Leu Asp Ala Tyr Phe Pro Glu Leu
325 330 335

Ile Ala Asn Phe Ala Ala Ser Leu Cys Ser Asp Val Ile Leu Tyr Pro
340 345 350

Leu Glu Thr Val Leu His Arg Leu His Ile Gln Gly Thr Arg Thr Ile
355 360 365

Ile Asp Asn Thr Asp Leu Gly Tyr Glu Val Leu Pro Ile Asn Thr Gln
370 375 380

Tyr Glu Gly Met Arg Asp Cys Ile Asn Thr Ile Arg Gln Glu Glu Gly
385 390 395 400

Val Phe Gly Phe Tyr Lys Gly Phe Gly Ala Val Ile Ile Gln Tyr Thr
405 410 415

Leu His Ala Ala Val Leu Gln Ile Thr Lys Ile Ile Tyr Ser Thr Leu
420 425 430

Leu Gln

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 185 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: TS-39(TB2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Glu Leu Arg Arg Phe Asp Arg Phe Leu His Glu Lys Asn Cys Met Thr
1 5 10 15

Asp Leu Leu Ala Lys Leu Glu Ala Lys Thr Gly Val Asn Arg Ser Phe
20 25 30

Ile Ala Leu Gly Val Ile Gly Leu Val Ala Leu Tyr Leu Val Phe Gly
35 40 45

Tyr Gly Ala Ser Leu Leu Cys Asn Leu Ile Gly Phe Gly Tyr Pro Ala
50 55 60

Tyr Ile Ser Ile Lys Ala Ile Glu Ser Pro Asn Lys Glu Asp Asp Thr
65 70 75 80

Gln Trp Leu Thr Tyr Trp Val Val Tyr Gly Val Phe Ser Ile Ala Glu
85 90 95

Phe Phe Ser Asp Ile Phe Leu Ser Trp Phe Pro Phe Tyr Tyr Ile Leu
100 105 110

Lys Cys Gly Phe Leu Leu Trp Cys Met Ala Pro Ser Pro Ser Asn Gly
115 120 125

Ala Glu Leu Leu Tyr Lys Arg Ile Ile Arg Pro Phe Phe Leu Lys His
130 135 140

Glu Ser Gln Met Asp Ser Val Val Lys Asp Leu Lys Asp Lys Ala Lys
145 150 155 160

Glu Thr Ala Asp Ala Ile Thr Lys Glu Ala Lys Lys Ala Thr Val Asn
165 170 175

Leu Leu Gly Glu Glu Lys Lys Ser Thr
180 185

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2842 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: APC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu
1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn
20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu
35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly
50 55 60

Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser
65 70 75 80

Asn Phe Pro Gly Val Lys Leu Arg Ser Lys Met Ser Leu Arg Ser Tyr
85 90 95

Gly Ser Arg Glu Gly Ser Val Ser Ser Arg Ser Gly Glu Cys Ser Pro
100 105 110

Val Pro Met Gly Ser Phe Pro Arg Arg Gly Phe Val Asn Gly Ser Arg
115 120 125

Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu
130 135 140

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala
145 150 155 160

Gln Leu Gln Asn Leu Thr Lys Arg Ile Asp Ser Leu Leu Thr Glu Asn
165 170 175

Phe Ser Leu Gln Thr Asp Met Thr Arg Arg Gln Leu Glu Tyr Glu Ala
180 185 190

Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln Asp
195 200 205

Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile Glu
210 215 220

Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr Glu
225 230 235 240

Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp Ala
245 250 255

Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala Thr
260 265 270

Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr Ala
275 280 285

Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu Thr
290 295 300

Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser Met
305 310 315 320

Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala Met
325 330 335

Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys Leu
340 345 350

Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val Leu
355 360 365

Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser Ala
370 375 380

Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly Arg
385 390 395 400

Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr Cys
405 410 415

Glu Thr Cys Trp Glu Trp Gln Glu Ala His Glu Pro Gly Met Asp Gln
420 425 430

Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro Ala
435 440 445

Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His Ala
450 455 460

Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln Val
465 470 475 480

Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr Leu
485 490 495

Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp Val
500 505 510

Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala Leu
515 520 525

Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile Ala
530 535 540

Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys Lys
545 550 555 560

Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala Leu
565 570 575

Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu Trp
580 585 590

Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala Val
595 600 605

Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser Gln
610 615 620

Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg Asn
625 630 635 640

Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu Arg
645 650 655

Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His Ser
660 665 670

Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser Ala
675 680 685

Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val Ser
690 695 700

Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met Gly
705 710 715 720

Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys Tyr
725 730 735

Lys Asp Ala Asn Ile Met Ser Pro Gly Ser Ser Leu Pro Ser Leu His
740 745 750

Val Arg Lys Gln Lys Ala Leu Glu Ala Glu Leu Asp Ala Gln His Leu
755 760 765

Ser Glu Thr Phe Asp Asn Zle Asp Asn Leu Ser Pro Lys Ala Ser His 

770 775 780 

Arg Ser Lys Gin Arg His Lys Gin Ser Leu Tyr Gly Asp Tyr Val Phe 
785 790 795 800 

Asp Thr Asn Arg His Asp Asp Asn Arg Ser Asp Asn Phe Asn Thr Gly 
805 810 815 

Asn Met Thr Val L: i Ser Pro Tyr Leu Asn Thr Thr Val Leu Pro Ser 
820 825 830 

Ser Ser Ser Ser Arg Gly Ser Leu Asp Ser Ser Arg Ser Glu Lys Asp 
835 840 845 

Arg Ser Leu Glu Arg Glu Arg Gly Zle Gly Leu Gly Asn Tyr His Pro 
850 855 860 

Ala Thr Glu Asn Pro Gly Thr Ser Ser Lys Arg Gly Leu Gin Zle Ser 
865 870 875 880 

Thr Thr Ala Ala Gin Zle Ala Lys Val Met Glu Glu Val Ser Ala Zle 
885 890 895 

His Thr Ser Gin Glu Asp Arg Ser Ser Gly Ser Thr Thr Glu Leu His 
900 905 910 

Cys Val Thr Asp Glu Arg Asn Ala Leu Arg Arg Ser Ser Ala Ala His 

915 920 925 

Thr His Ser Asn Thr Tyr Asn Phe Thr Lys Ser Glu Asn Ser Asn Arg 

930 935 940 

Thr Cys Ser Met Pro Tyr Ala Lys Leu Glu Tyr Lys Arg Ser Ser Asn 

945 950 955 960 

Asp Ser Leu Asn Ser Val Ser Ser Ser Asp Gly Tyr Gly Lys Arg Gly 
965 970 975 

Gin Met Lys Pro Ser Zle Glu Ser Tyr Ser Glu Asp Asp Glu Ser Lys 
980 985 990 

Phe Cys Ser Tyr Gly Gin Tyr Pro Ala Asp Leu Ala Bis Lys Zle His 

995 1000 1005 

Ser Ala Asn His Met Asp Asp Asn Asp Gly Glu Leu Asp Thr Pro Zle 

1010 1015 1020 

Asn Tyr Ser Leu Lys Tyr Ser Asp Glu Gin Leu Asn Ser Gly Arg Gin 

1025 1030 1035 1040 

Ser Pro Ser Gin Asn Glu Arg Trp Ala Arg Pro Lys Bis Zle Zle Glu 
1045 1050 1055 

Asp Glu Zle Lys Gin Ser Glu Gin Arg Gin Ser Arg Asn Gin Ser Thr 
1060 1065 1070 

Thr Tyr Pro Val Tyr Thr Glu Ser Thr Asp Asp Lys His Leu Lys Phe 
1075 1080 1085 
Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser Arg
1090 1095

Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly Ile
1105 1110

Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu Asp
1125 1130

Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln His
1140 1145

Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu Glu
1155 1160 1165

l, y8 Arg_Hi. val Asp Gin Profile A.p Tyr Ser J™**" *** Ala Thr 



1170 

AjP s Il. Pro ser Ser GlnLy. Gin Ser Ph. Serine Ser Ly. Ser jj^ 



ser Gly Gin Ser Ser Ly. Thr Glu Hi. Metier Ser Ser Ser GlU g A.n 

Thr ser Thr ProSer Ser A.n Ala Ly^Arg Gin A«n Gin Leu^Hi. Pro 

Ser Ser Ale Gin Ser Arg Ser Gly Gin Pro Gin Ly. Ala Ale Thr Cy. 
1235 1240 

Ly. Val Ser Ser II. A.n JJ« 5 6ltt 11- 01n JJfo*^ *** ^ 6lU 
A.| s Thr Pro II. cy. PheSer Arg Cy. s~ *.rl*» Ser Ser U. Jj^ 
Ser Ala Glu A«p Glu II. Cly Cy. A.n Oln Thr Thr Gin Glu Jl^Aap 
Ser Ala Aen Thr Leu Gin lie Ala Glu U. Ly. Glu Ly. II. Gly Thr 

1300 1305 AJAW 

Arg ser Ala Clu An> Pro Val SerOlu Val Pro Ala Valuer Gin Hi. 



1315 
1 

1330 



pro ArgThr Ly. Mr Ser Aro^Leu Gin Gly Ser S.rL«i Ser ser Glu 



Serbia Arg Hi. Ly. AlaVal Glu Ph. Ser S^Gly Ala Ly. S«r PrO Q 
ser Ly. Ser Gly Almoin Thr Pro Ly. Se^Pro Pro Glu Hi. Tyr^al 

Gin Glu Thr Pro I*u Met Phe Ser Arg Cy. Thr Ser Val Ser Ser Leu 

1380 1385 mtv 

Asp St Phe Glu Ser Arg St Ue^Ala S« S« Val Ginger Glu Pro 



1395 
G 

1410 



Cys ser Gly Met Val St Civile lie St Pro Leu Pro Aep 
Ser Pro Gly Gln Thr Met Pro Pro Ser Arg Ser Lys Thr Pro Pro Pro
1425 1430 1435 1440

Pro Pro Gln Thr Ala Gln Thr Lys Arg Glu Val Pro Lys Asn Lys Ala
1445 1450 1455

Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val Asn
1460 1465 1470

Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu Leu
1475 1480 1485

His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser Ser
1490 1495 1500

Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val Glu
1505 1510 1515 1520

Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu Thr
1525 1530 1535

Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu Ala
1540 1545 1550

Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp Asp
1555 1560 1565

Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro Thr
1570 1575 1580

Lys Ser Ser Arg Lys Ala Lys Lys Pro Ala Gln Thr Ala Ser Lys Leu
1585 1590 1595 1600

Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys Leu
1605 1610 1615

Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe Thr
1620 1625 1630

Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro Ile
1635 1640 1645

Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser Pro
1650 1655 1660

Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln Ser
1665 1670 1675 1680

Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser Thr
1685 1690 1695

Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu Leu
1700 1705 1710

Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile Asn
1715 1720 1725

Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys Lys
1730 1735 1740

Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro Asn
1745 1750 1755 1760
Ly. A.n Gin fu Asp^Gly ty. tys ty. tys^Pro Thr S.r Pro V.^Ly. 
Pro II. Pro GXn A«> Thr Glu Tyr ArgThr Arg V«l Arg ty. A-n Al. 

1780 1785 A7 ' u 

Asp Ser s*. ton Asa Leu Asa JlaGlu Arg V.l Ph. Ser^p A.» ty. 

Asp ser ty. tys Gin A«n Leu Ly« Asn Aan s.r Lye Aep Ph. Aan Asp 

1810 181S lBZO 

Lys Leu Pro Asn Asn Glu Aap Arg V.1 Arg ClySer Ph. Ala Ph. Aep 

1825 I 830 1835 

Ser Pro Hi. Hi. Tyr Thr Pro II. Gla Gly Thr Pro Tyr Cy. Ph. Ser 

1845 1850 *w» 

Arg A«n Asp Ser Leu S.r Ser leu Asp Phe Asp Asp Asp Asp VH Asp 

I860 1865 M/u 

Leu ser Arg Glu Lys Al. Glu Leu Arg ty. Ala Ly. Olu Asn tya Glu 

1875 1880 *»85 

Ser Glu Al* Ly. val Thr Ser Hie Thr Glu Leu Thr Ser A.n Gin Gin 
1890 1895 1900 

S.r Ala Asa ty. Thr Gin Ala II. Al. tya Gla Pro II. Aan Arg Gly 
1905 1°10 1919 

Ola Pro tys Pro II. teu Gla ty. Gla Ser Thr Ph. Pro Ola Ser Ser 

1925 1930 i«» 

Ly. A«P II. ProA-P Arg ©ly Al* JJj^ *«P 01u **• g^ 61 " 

Phe Ala lie Glu Aan Thr Pro Val Cya Phe Ser His Aaa Ser Ser Leu 

1955 1960 X»os 

Ser Ser Leu Ser Asp II- Asp Ola Glu Asa Asa Asa ty. Glu Asa Glu 

X970 1975 19 

Profile ty. olu Thr OluPro Pro Asp ser Gladly Glu Pro Ser ly^ 

Pro Ola Ala Ser Gly Tyr Al. Pro Lye SerPhe His Val Olu Aap^Thr 

Pro val Cy. Phe^ Arg Asa Ser Sjr^ Ser S.r x-u SjrUe **p 

Ser Olu Asp Asp leu teu Ola Olu Cy. II. Ser Ser Ala Met Pro ty. 

2035 2040 2w«s 

Ly. ty. ty. Pro Ser Arg J 8 ^ 1 ** el * *" f o6 V** Hi * *■* 
ArgA«» Met Oly Gly Ile^Leu Gly Glu A*> Le^Thr I*u J*p teu Ly^ 
Asp II. Ola Arg Pr^Asp Ser Glu Hi. GlyLeu Ser Pro Asp SerGla 
Asn Phe Asp Trp Lys Ala Ile Gln Glu Gly Ala Asn Ser Ile Val Ser
2100 2105 2110

Ser Leu His Gln Ala Ala Ala Ala Ala Cys Leu Ser Arg Gln Ala Ser
2115 2120 2125

Ser Asp Ser Asp Ser Ile Leu Ser Leu Lys Ser Gly Ile Ser Leu Gly
2130 2135 2140

Ser Pro Phe His Leu Thr Pro Asp Gln Glu Glu Lys Pro Phe Thr Ser
2145 2150 2155 2160

Asn Lys Gly Pro Arg Ile Leu Lys Pro Gly Glu Lys Ser Thr Leu Glu
2165 2170 2175

Thr Lys Lys Ile Glu Ser Glu Ser Lys Gly Ile Lys Gly Gly Lys Lys
2180 2185 2190

Val Tyr Lys Ser Leu Ile Thr Gly Lys Val Arg Ser Asn Ser Glu Ile
2195 2200 2205

Ser Gly Gln Met Lys Gln Pro Leu Gln Ala Asn Met Pro Ser Ile Ser
2210 2215 2220

Arg Gly Arg Thr Met Ile His Ile Pro Gly Val Arg Asn Ser Ser Ser
2225 2230 2235 2240

Ser Thr Ser Pro Val Ser Lys Lys Gly Pro Pro Leu Lys Thr Pro Ala
2245 2250 2255

Ser Lys Ser Pro Ser Glu Gly Gln Thr Ala Thr Thr Ser Pro Arg Gly
2260 2265 2270

Ala Lys Pro Ser Val Lys Ser Glu Leu Ser Pro Val Ala Arg Gln Thr
2275 2280 2285

Ser Gln Ile Gly Gly Ser Ser Lys Ala Pro Ser Arg Ser Gly Ser Arg
2290 2295 2300

Asp Ser Thr Pro Ser Arg Pro Ala Gln Gln Pro Leu Ser Arg Pro Ile
2305 2310 2315 2320

Gln Ser Pro Gly Arg Asn Ser Ile Ser Pro Gly Arg Asn Gly Ile Ser
2325 2330 2335

Pro Pro Asn Lys Leu Ser Gln Leu Pro Arg Thr Ser Ser Pro Ser Thr
2340 2345 2350

Ala Ser Thr Lys Ser Ser Gly Ser Gly Lys Met Ser Tyr Thr Ser Pro
2355 2360 2365

Gly Arg Gln Met Ser Gln Gln Asn Leu Thr Lys Gln Thr Gly Leu Ser
2370 2375 2380

Lys Asn Ala Ser Ser Ile Pro Arg Ser Glu Ser Ala Ser Lys Gly Leu
2385 2390 2395 2400

Asn Gln Met Asn Asn Gly Asn Gly Ala Asn Lys Lys Val Glu Leu Ser
2405 2410 2415

Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser Glu
2420 2425 2430
Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro Ser
2435 2440 2445

Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser Leu
2450 2455 2460

Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln Thr
2465 2470 2475 2480

Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His Ser
2485 2490 2495

Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser Pro
2500 2505 2510

Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile Ala
2515 2520 2525

Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser Gly
2530 2535 2540

Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg Val
2545 2550 2555 2560

Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala Ser
2565 2570 2575

Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val Asn
2580 2585 2590

Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala Lys
2595 2600 2605

Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn Ser
2610 2615 2620

Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser Lys
2625 2630 2635 2640

Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp Val
2645 2650 2655

Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly Arg
2660 2665 2670

Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu Lys
2675 2680 2685

Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln Asn
2690 2695 2700

Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn Arg
2705 2710 2715 2720

Leu Asn Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr Glu
2725 2730 2735

Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn Glu
2740 2745 2750

Ser Ser Ile Val Glu Arg Thr Pro Phe Ser Ser Ser Ser Ser Ser Lys
2755 2760 2765
His Ser Ser Pro Ser Gly Thr Val Ala Ala Arg Val Thr Pro Phe Asn
2770 2775 2780

Tyr Asn Pro Ser Pro Arg Lys Ser Ser Ala Asp Ser Thr Ser Ala Arg
2785 2790 2795 2800

Pro Ser Gln Ile Pro Thr Pro Val Asn Asn Asn Thr Lys Lys Arg Asp
2805 2810 2815

Ser Lys Thr Asp Ser Thr Glu Ser Ser Gly Thr Gln Ser Pro Lys Arg
2820 2825 2830

His Ser Gly Ser Tyr Leu Val Thr Ser Val
2835 2840

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: ral2 (yeast) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Thr Gly Ala Lys Gly Leu Gln Leu Arg Ala Leu Arg Arg Ile Ala
1 5 10 15

Arg Ile Glu Gln Gly Gly Thr Ala Ile Ser Pro Thr Ser Pro Leu
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: m3(mAChR) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Leu Tyr Trp Arg Ile Tyr Lys Glu Thr Glu Lys Arg Thr Lys Glu Leu
1 5 10 15

Ala Gly Leu Gln Ala Ser Gly Thr Glu Ala Glu Thr Glu
20 25

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: MCC 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu Tyr Pro Asn Leu Ala Glu Glu Arg Ser Arg Trp Glu Lys Glu Leu
1 5 10 15

Ala Gly Leu Arg Glu Glu Asn Glu Ser Leu Thr Ala Met 

20 25 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

CTATCAAGAC TGTGACTTTT AATTGTAGTT TATCCATTTT 40 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TTTAGAATTT GATGTTAATA TATTCTCTTC TTTTTAACAC 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GTAGATTTTA AAAAGCTGTT TTAAAATAAT TTTTTAAGCT 
(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AACCAATTGT TGTATAAAAA CTTGTTTCTA TTTTATTTAG 
(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GTAACTTTTC TTCATATAGT AAACATTGCC TTGTGTACTC 
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
NNNNNNNNNN NNNGTCCCTT TTTTTAAAAA AAAAAAATAG
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GTAAGTAACT TGGCA6TACA JUOTMTTOA AACTTTAATA 

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
ATAGAAGATA TTGATACTTT TTTATTATTT GTGGTTTTAG
(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GTAAGTTACT TGTTTCTAAG TGATAAAACA GYCAAGAGCT 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
AATAAAAACA TAACTAATTA GGTTTCTTGT TTTATTTTAG 
(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GTTAGTAAAT TSCCTTTTTT GTTTGTGGGT ATAAAAATAG 
(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
ACCATTTTTC CATGTACTGA TGTTAACTCC ATCTTAACAG 40 
(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GTAAATAAAT TATTTTATCA TATTTTTTAA AATTATTTAA 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
CATGATGTTA TCTGTATTTA CCTATAGTCT AAATTATACC ATCTATAATG TGCTTAATTT 60
TTAG 64

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
GTAACAGAAG ATTACAAACC CTGGTCACTA ATGCCATGAC TACTTTGCTA AG 52



(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
GGATATTAAA GTCGTAATTT TGTTTCTAAA CTCATTTGGC CCACAG 46 
(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
GTATGTTCTC TATAGTGTAC ATCGTAGTGC ATGTTTCAAA 40 
(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CATCATTGCT CTTCAAATAA CAAAGCATTA TGGTTTATGT TGATTTTATT TTTCAG 56 
(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GTAAGACAAA AATGTTTTTX AATCACATAG ACAATTACTG GTG 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
TTAGATGATT GTCTTTTTCC TCTTGCCCTT TTTAAATTAG 
(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
GTATGTTTTT ATAACATGTA TTTCTTAAGA TAGCTCAGGT ATGA
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CCTTGGCTTC AAGTTGNCTT TTTAATGATC CTCTATTCTG TATTTAATTT ACAG 54 
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GTACTATTTA GAATTTCACC TGTTTTTCTT TTTTCTCTTT TTCTTTGAGG CAGCCTCTCA 60
CTCTG 65

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GCAACTAGTA TGATTTTATG TATAAATTAA TCTAAAATTG ATTAATTTCC AG 52 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GTACCTTTGA AAACATTTAG TACTATAATA TGAATTTCAT GT 42 
(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
CCAACTCNAA TTAGATGACC CATATTCAGA AACTTACTAG 40 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
CTATATATAO AGTTTTATAT TAC*TTTAAA GTACAGAATT CATACTCTCA AAAA 54 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
ATTGTGACCT TAATTTTGTG ATCTCTTGAT TTTTATTTCA G 41

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
TCCCCGCCTG CCGCTCTC 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GCAGCGGCGG CTCCCGTG 
(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
GTGAACGGCT CTCATGCTGC 
(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
ACGTGCGGGG AGGAATGGA
(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
ATGATATCTT ACCAAATGAT ATAC 24
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
TTATTCCTAC TTCTTCTATA CRG 23
(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
TACCCATGCT GGCTCTTTTT C 21
(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
TGGGCCCATC TTGTTCCTGA
(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
ACATTAGGCA CAAAGCTTGC AA 
(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
ATCAAGCTCC AGTAAGAAGG TA 
(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
TCCGGCTCCT CGGTTCTTG
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
GCCCCTTCCT TTCTGAGGAC
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
TTTTCTCCTC CCTCTTACTG C
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:



ATGACACCCC CCATTCCCTC 

(2) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
CCACTTAAAG CACATATATT TAGT 24 
(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
GTATGCAAAA TAGTGAAGAA CC 22 
(2) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
TTCTTAAGTC CTGTTTTTCT TTTG 24 
(2) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
TTTAGAACCT TTTTTGTGTT GTG
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
CTCAGATTAT ACACTAAGCC TAAC
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CATGTCTCTT ACJU3TAGTAC CA 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
AGGTCCAAGG GTAGCCAAGG 20 
(2) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
TAAAAATGGA TAAACTACAA TTAAAAG 27 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
AAATACAGAA TCATGTCTTG AAGT 24
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
ACACCTAAAG ATGACAATTT GAG 23
(2) INFORMATION FOR SEQ ID NO:63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
TAACTTAGAT AGCAGTAATT TCCC 24

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
ACAATAAACT GGAGTACACA AGG 23

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
ATAGGTCATT GCTTCTTGCT GAT 23
(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: 
TGAATTTTAA TGGATTACCT AGGT 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
CTTTTTTTGC TTTTACTGAT TAACG 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
TGTAATTCAT TTTATTCCTA ATAGCTC 
(2) INFORMATION FOR SEQ ID NO:69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
GGTAGCCATA GTATGATTAT TTCT 24

(2) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CTACCTATTT TTATACCCAC AAAC 24

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
AAGAAAGCCT ACACCATTTT TGC
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GATCATTCTT AGAACCATCT TGC 

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
ACCTATAGTC TAAATTATAC CATC 
(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
GTCATGGGAT TAGTGACCAG 
(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
AGTCGTAATT TTGTTTCTAA ACTC

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:76: 
TGAAGGACTC GGATTTCACG C 
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
TCATTCACTC ACAGCCTGAT GAC 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
GCTTTGAAAC ATGCACTACG AT

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
AAACATCATT GCTCTTCAAA TAAC 
(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
TACCATGATT TAAAAATCCA CCAG 
(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GATGATTGTC TTTTTCCTCT TGC

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
CTCAGCTATC TTAAGAAATA CATC

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO:83: 
TTTTAAATGA TCCTCTATTC TGTAT 
(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
ACAGAGTCAG ACCCTGCCTC AAAG 
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
TTTCTATTCT TACTGCTAGC ATT

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
ATACACAGGT AAGAAATTAG GA 
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
TAGATGACCC ATATTCTGTT TC 
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
CAATTAGGTC TTTTTGAGAG TA 
(2) INFORMATION FOR SEQ ID NO:89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
GTTACTGCAT ACACATTCTG AC 
(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
GCTTTTTGTT TCCTAACATG AAG 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
TCTCCCACAG GTAATACTCC C 
(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
CCTAGAACTG AATGCGGTAC G

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
CAGGACAAAA TAATCCTGTC CC 22

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
ATTTTCTTAG TTTCATTCTT CCTC 



24 
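
Each entry of the listing declares a LENGTH and then prints the sequence in blocks of ten bases. As a minimal sketch (in Python, which is not part of the application), the declared length could be checked against the printed sequence as follows. The three entries are transcribed from SEQ ID NOs 66, 74 and 94 above; the names ENTRIES, normalize, check_entry and check_all are hypothetical and chosen only for illustration.

```python
VALID_BASES = set("ACGT")

# (declared length, sequence as printed) for three entries of the listing.
ENTRIES = {
    66: (24, "TGAATTTTAA TGGATTACCT AGGT"),
    74: (20, "GTCATGGGAT TAGTGACCAG"),
    94: (24, "ATTTTCTTAG TTTCATTCTT CCTC"),
}


def normalize(printed):
    """Drop the spaces that separate the ten-base blocks."""
    return "".join(printed.split()).upper()


def check_entry(declared_length, printed):
    """Return a list of problems; an empty list means the entry is consistent."""
    seq = normalize(printed)
    problems = []
    if len(seq) != declared_length:
        problems.append(f"length {len(seq)} != declared {declared_length}")
    bad = sorted(set(seq) - VALID_BASES)
    if bad:
        problems.append(f"unexpected characters: {''.join(bad)}")
    return problems


def check_all(entries):
    """Check every entry and map SEQ ID NO to its list of problems."""
    return {seq_id: check_entry(n, s) for seq_id, (n, s) in entries.items()}
```

A check of this kind only flags inconsistencies, such as a character that is not a nucleotide; it cannot say which base was intended.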
CLAIMS

1. A method of diagnosing or prognosing a neoplastic tissue 
of a human, comprising: 

detecting somatic alteration of wild-type APC gene coding
sequences or their expression products in a tumor tissue isolated
from a human, said alteration indicating neoplasia of the tissue.

2. The method of claim 1 wherein the expression products 
are mRNA molecules. 

3. The method of claim 2 wherein the alteration of 
wild-type APC mRNA is detected by hybridization of mRNA from said 
tissue to an APC gene probe. 

4. The method of claim 1 wherein alteration of wild-type 
APC gene coding sequences is detected by observing shifts in 
electrophoretic mobility of single-stranded DNA on non-denaturing 
polyacrylamide gels. 

5. The method of claim 1 wherein alteration of wild-type
APC gene coding sequences is detected by hybridization of an APC
gene coding sequence probe to genomic DNA isolated from said tissue.

6. The method of claim 5 further comprising: 

subjecting genomic DNA isolated from a non-neoplastic 
tissue of the human to Southern hybridization with the APC gene cod- 
ing sequence probe; and 

comparing the hybridizations of the APC gene probe to 
said tumor and non-neoplastic tissues. 

7. The method of claim 5 wherein the APC gene probe 
detects a restriction fragment length polymorphism. 

8. The method of claim 1 wherein the alteration of 
wild-type APC gene coding sequences is detected by determining the 
sequence of all or part of an APC gene in said tissue using a polymerase 
chain reaction, deviations in the APC sequence determined from that 
of the sequence shown in Figure 7 (SEQ ID NO.: 1) suggesting neoplasia. 

9. The method of claim 1 wherein the alteration of wild-type
APC gene coding sequences is detected by identifying a mismatch
between molecules (1) an APC gene or APC mRNA isolated from said
tissue and (2) a nucleic acid probe complementary to the human
wild-type APC gene coding sequence, when molecules (1) and (2) are
hybridized to each other to form a duplex.

10. The method of claim 5 wherein the APC gene probe
hybridizes to an exon selected from the group consisting of: (1)
nucleotides 822 to 930; (2) nucleotides 931 to 1309; (3) nucleotides
1406 to 1545; and (4) nucleotides 1956 to 2256.
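
The four nucleotide ranges recited in claim 10 can be treated as closed intervals. The following Python sketch is an illustration only; it assumes that positions are counted in the nucleotide numbering used for the ranges in the claim, and the names EXON_RANGES, exon_group and probe_overlaps are hypothetical. It shows how a position, or the span covered by a probe, could be tested against those ranges.

```python
# Closed intervals (start, end) as recited in claim 10, in claim order.
EXON_RANGES = [(822, 930), (931, 1309), (1406, 1545), (1956, 2256)]


def exon_group(position):
    """Return the 1-based index of the recited range containing position, or None."""
    for index, (start, end) in enumerate(EXON_RANGES, start=1):
        if start <= position <= end:
            return index
    return None


def probe_overlaps(probe_start, probe_end):
    """List the recited ranges overlapped by a probe spanning [probe_start, probe_end]."""
    return [
        index
        for index, (start, end) in enumerate(EXON_RANGES, start=1)
        if probe_start <= end and start <= probe_end
    ]
```

For example, a probe spanning nucleotides 900 to 1000 overlaps the first two ranges, whereas one spanning 1310 to 1405 falls between the second and third ranges and overlaps none of them.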

11. The method of claim 1 wherein the alteration of wild-type
APC gene coding sequences is detected by amplification of APC
gene sequences in said tissue and hybridization of the amplified APC
sequences to nucleic acid probes which comprise APC sequences.

12. The method of claim 1 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

13. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a deletion mutation. 

14. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a point mutation. 

15. The method of claim 1 wherein the detection of alteration
of wild-type APC gene coding sequences comprises screening for
an insertion mutation.

16. The method of claim 1 wherein the tumor tissue is a
colorectal tissue.

17. The method of claim 6 wherein the non-neoplastic tissue
isolated from a human is from colonic mucosa.

18. The method of claim 1 wherein the expression products
are protein molecules.

19. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by immunoblotting. 

20. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry.

21. The method of claim 18 wherein the alteration of
wild-type APC protein is detected by assaying for binding interactions 
between APC protein of said tumor tissue and a second cellular protein. 

22. The method of claim 21 wherein the second cellular protein
is selected from the group consisting of MCC protein, wild-type
APC protein, and a G protein.

23. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by assaying for phospholipid 
metabolites. 

24. A method of supplying wild-type APC gene function to a 
cell which has lost said function by virtue of a mutation in an APC 
gene, comprising: 

introducing a wild-type APC gene into a cell which has 
lost said gene function such that said wild-type APC gene is expressed 
in the cell. 

25. The method of claim 24 wherein the wild-type APC gene 
introduced recombines with the endogenous mutant APC gene present 
in the cell by a double recombination event to correct the APC gene 
mutation. 

26. A method of supplying wild-type APC gene function to a 
cell which has altered APC function by virtue of a mutation in an APC 
gene, comprising: 

introducing a portion of a wild-type APC gene into a cell 
which has lost said gene function such that said portion is expressed in 
the cell, said portion encoding a part of the APC protein which is 
required for non-neoplastic growth of said cell. 

27. A method of supplying wild-type APC gene function to a 
cell which has altered APC function by virtue of a mutation in an APC 
gene, comprising: 

applying human wild-type APC protein to a cell which has 
lost wild-type APC function. 

28. A method of supplying wild-type APC gene function to a
cell which has altered APC gene function by virtue of a mutation in an
APC gene, comprising:

introducing into the cell a molecule which mimics the
function of wild-type APC protein.

29. A pair of single stranded DNA primers for determination
of a nucleotide sequence of an APC gene by polymerase chain reaction,
the sequence of said primers being derived from chromosome 5q band
21, wherein the use of said primers in a polymerase chain reaction
results in synthesis of DNA having all or part of the sequence shown in
Figure 7.

30. The primers of claim 29 which have restriction enzyme 
sites at each 5' end. 

31. The pair of primers of claim 29 having sequences corre- 
sponding to APC introns. 

32. A nucleic acid probe complementary to human wild-type 
APC gene coding sequences. 

33. The nucleic acid probe of claim 32 which hybridizes to an
exon selected from the group consisting of: (1) nucleotides 822 to 930;
(2) nucleotides 931 to 1309; (3) nucleotides 1406 to 1545; and (4)
nucleotides 1956 to 2256.

34. A kit for detecting alteration of wild-type APC genes 
comprising a battery of nucleic acid probes which in the aggregate 
hybridize to all nucleotides of the APC gene coding sequences. 

35. A method of detecting the presence of a neoplastic tissue 
in a human, comprising: 

detecting in a body sample isolated from a human alter- 
ation of a wild-type APC gene coding sequence or wild-type APC 
expression product, said alteration indicating the presence of a 
neoplastic tissue in the human. 

36. The method of claim 35 wherein said body sample is 
selected from the group consisting of serum, stool, urine and sputum. 

37. A method of detecting genetic predisposition to cancer,
including familial adenomatous polyposis (FAP) and Gardner's Syndrome
(GS), in a human comprising:

detecting a germline alteration of wild-type APC gene
coding sequences or their expression products in a human sample
selected from the group consisting of blood and fetal tissue, said
alteration indicating predisposition to cancer.

38. The method of claim 37 wherein the expression products 
are mRNA molecules. 

39. The method of claim 38 wherein the alteration of
wild-type APC mRNA is detected by hybridization of mRNA from said
tissue to an APC gene probe.

40. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by observing shifts in 
electrophoretic mobility of single-stranded DNA on non-denaturing 
polyacrylamide gels. 

41. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by hybridization of an APC 
gene coding sequence probe to genomic DNA isolated from said tissue. 

42. The method of claim 41 wherein the APC gene coding 
sequence probe detects a restriction fragment length polymorphism. 

43. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by determining the 
sequence of all or part of an APC gene in said tissue using a polymerase 
chain reaction, deviations in the APC sequence determined from the 
sequence of Figure 7 suggesting predisposition to cancer. 

44. The method of claim 37 wherein the alteration of wild-type
APC gene coding sequences is detected by identifying a mismatch
between molecules (1) an APC gene or APC mRNA isolated from said
tissue and (2) a nucleic acid probe complementary to the human
wild-type APC gene coding sequence, when molecules (1) and (2) are
hybridized to each other to form a duplex.

45. The method of claim 41 wherein the APC gene probe
hybridizes to an exon selected from the group consisting of:
(1) nucleotides 822 to 930; (2) nucleotides 931 to 1309; (3)
nucleotides 1406 to 1545; and (4) nucleotides 1956 to 2256.

46. The method of claim 37 wherein the alteration of wild-type
APC gene coding sequences is detected by amplification of APC
gene sequences in said tissue and hybridization of the amplified APC
sequences to nucleic acid probes which comprise APC gene coding
sequences.

47. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

48. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a deletion mutation. 

49. The method of claim 37 wherein the detection of alteration
of wild-type APC gene coding sequences comprises screening for
a point mutation.

50. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
an insertion mutation. 

51. The method of claim 37 wherein the expression products
are protein molecules.

52. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunoblotting. 

53. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry. 

54. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by assaying for binding interactions 
between APC protein isolated from said tissue and a second cellular 
protein. 

55. The method of claim 54 wherein the second cellular pro- 
tein is selected from the group consisting of MCC protein, wild-type 
APC protein and a G protein. 

56. A method of screening for genetic predisposition to cancer,
including familial adenomatous polyposis (FAP) and Gardner's
Syndrome (GS), in a human comprising:

detecting among kindred persons the presence of a DNA
polymorphism which is linked to a mutant APC allele in an individual
having a genetic predisposition to cancer, said kindred being
genetically related to the individual, the presence of said polymorphism
suggesting a predisposition to cancer.

57. A preparation of the human APC protein substantially 
free of other human proteins, the amino acid sequence of said protein 
corresponding to that shown in Figure 3 or 7 (SEQ ID NO: 1). 

58. A preparation of antibodies immunoreactive with a 
human APC protein and not substantially immunoreactive with other 
human proteins. 

59. A method of testing therapeutic agents for the ability to
suppress a neoplastically transformed phenotype, comprising:

applying a test substance to a cultured epithelial cell 
which carries a mutation in an APC allele; 

determining whether said test substance suppresses the 
neoplastically transformed phenotype of the cell. 

60. The method of claim 59 wherein the cultured epithelial 
cell has been genetically engineered to carry the mutation in the APC 
allele. 

61. A method of testing therapeutic agents for the ability to 
suppress neoplastic growth, comprising: 

administering a test substance to an animal which carries 
a mutant APC allele in its genome; 

determining whether said test substance prevents or sup- 
presses the growth of tumors. 

62. A transgenic animal which carries a mutant APC allele 
from a second animal species in its genome. 

63. An animal which has been genetically engineered to con- 
tain an insertion mutation which disrupts an APC allele in its genome. 

64. A cDNA molecule which encodes a protein having the 
amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

65. An isolated DNA molecule which encodes a protein having 
the amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

66. A yeast artificial chromosome which is known as 37HG4.

TABLE IIA

Germline mutations of the APC gene in FAP and GS Patients

PATIENT  CODON  NUCLEOTIDE CHANGE*  AMINO ACID CHANGE  AGE  EXTRA-COLONIC DISEASE

93       279    TCA->TGA            Ser->Stop          39   Mandibular Osteoma
24       301    CGA->TGA            Arg->Stop          46   None
34       301    CGA->TGA            Arg->Stop          27   Desmoid Tumor
21       413    CGC->TGC            Arg->Cys           24   Mandibular Osteoma
60       712    TCA->TGA            Ser->Stop          37   Mandibular Osteoma
3746     243    CACAC->CAG          splice-junction
3460     301    CGA->TGA            Arg->Stop
3827     456    CTTTCA->CTTCA       frameshift
3712     500    T->C                Tyr->Stop

* The mutated nucleotides are underlined.
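
The amino acid changes listed for single-codon substitutions follow from the standard genetic code. As a sketch only (Python is used for illustration and is not part of the application; CODON_TABLE, THREE_LETTER and describe_change are hypothetical names), a codon change such as CGA->TGA could be translated and classified as follows.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop",
}


def describe_change(wild_type, mutant):
    """Translate both codons and label the substitution.

    Returns a pair such as ("Arg->Stop", "nonsense"). The labels are
    "silent" (same residue), "nonsense" (mutant codon is a stop codon)
    and "missense" (any other change).
    """
    before = CODON_TABLE[wild_type.upper()]
    after = CODON_TABLE[mutant.upper()]
    if before == after:
        kind = "silent"
    elif after == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return f"{THREE_LETTER[before]}->{THREE_LETTER[after]}", kind
```

The sketch handles in-frame single-codon substitutions only; splice-junction and frameshift entries cannot be described this way.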
TABLE IIB

Somatic Mutations in Sporadic CRC Patients



ns 

T16 

T47 

Til 

T35 

T9I 

T34 

T27 

T135 

7301 



CODON 1
MCC12

MCC145

MCC267
MCC490
MCC506
MCC691
APC2M
APC331
APC437
A PC I33S



NUCLEOTIDE CHANGE

GAG/ftaa|a->
GAG/ftaaaa

etca|/CGA->
tteag/GCA

CGG->CTG

TCG->TTG

CGG->CAG

GCT->GTT



AMINO ACID CHANGE

(Splice Donor)

(Splice Acceptor)

Arg->Leu
Ser->Leu
Arg->Gln
Ala->Val



CCAGT->C CCAGCCA GT (Insertion)

CGA->TGA Arg->Stop

CAA/f tu->CAA/gsu (Splice Donor)

CAG->TAG Gln->Stop



For splice site mutations, the codon nearest to the mutation is listed.

The underlined nucleotides were mutant; small case letters represent introns, large case letters represent exons.
TABLE III

Sequences of Primers Used for SSCP Analysis

[Primer sequences are not legible in the source scan.]

All primers are read in the 5' to 3' direction. The first primer in each
pair lies 5' of the exon it amplifies; the second primer lies 3' of the exon.
UP represents the -21M13 universal primer sequence; RP represents
the M13 reverse primer sequence.
TABLE IV

Seven Different Versions of the 20-Amino Acid Repeat

Consensus: F.VE.TP.CFSR.SSLSSLS

1262: YCVEDTPICFSRCSSLSSLS

1378: HYVQETPLMFSRCTSVSSLD

1492: FATESTPDGFSCSSSLSALS

1643: YCVEGTPINFSTATSLSDLT

1848: TPIEGTPYCFSRNDSLSSLD

1953: FAIENTPVCFSHNSSLSSLS

2013: FHVEDTPVCFSRNSSLSSLS

Numbers denote the first amino acid of each repeat. The consensus
sequence at the top reflects a majority amino acid at a given position.
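
A consensus of this kind can be derived column by column. The Python sketch below is an illustration under stated assumptions, not the procedure used to prepare the table: the seven repeats are transcribed with OCR damage repaired by comparison with the APC amino acid sequence, the rule applied is a simple plurality with ties left open as ".", and the names REPEATS and consensus are hypothetical.

```python
from collections import Counter

# Repeats keyed by the position of their first amino acid (assumed transcription).
REPEATS = {
    1262: "YCVEDTPICFSRCSSLSSLS",
    1378: "HYVQETPLMFSRCTSVSSLD",
    1492: "FATESTPDGFSCSSSLSALS",
    1643: "YCVEGTPINFSTATSLSDLT",
    1848: "TPIEGTPYCFSRNDSLSSLD",
    1953: "FAIENTPVCFSHNSSLSSLS",
    2013: "FHVEDTPVCFSRNSSLSSLS",
}


def consensus(sequences, open_symbol="."):
    """Return the most frequent residue per column, or open_symbol on a tie."""
    result = []
    for column in zip(*sequences):
        ranked = Counter(column).most_common()
        top_residue, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append(open_symbol if tied else top_residue)
    return "".join(result)
```

Under these assumptions the rule gives F.VE.TP.CFSRNSSLSSLS, which agrees with the consensus printed in the table except at position 13, where N occurs in three of the seven repeats but the table leaves the position open. The table may therefore have applied a stricter threshold at some positions.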
TB1 Amino Acid Sequence

VAPWVGSGR APRHPAPAAM HPRRP06F0G L6YRGGAR0E QGFGGAFPAR SFSTGSDLGH 60 

WVTTPPDIPG SRNLHWGEKS PPYGVPTTST PYEGPTEEPF SSGGGGSVQG QSSEQLNRFA 120 

GF6IGLASLF TENVLAHPCI VLRRQCQVNY HAQNYHLTPF TVINIMYSFN KTQGPRALWK 180 

GMGSTFIVQG VTLGAEGIIS EFTPLPREVL HKWSPKQIGE HLLUCSLTYV VAMPFYSASL 240 

IETVQSEIIR ONTGILECVK EGIGRVIGMG VPHSKRLLPL LSLIFPTVLH 6VLHYIZSSV 300 

IGXFVLLILK RKTYNSHLAE STSPVQSMLO AYFPELIANF AASL CSDVIL YPLETVLHRL 360 

HTflfiTRTTTn MTDL6VPVLP TMTOYEfiMRP CINTIRQEE6 VF6FYK6F6A VIIQYTLHAA 420 

VLQITKIIYS TLLQ 434 



TB2 Amino Acid Sequence 

ELRRFDRFLH EKNCMTOLLA KLEAKTGVNR SFIALGVIGL VALYLVFGYG ASLLCNLIGF 60 

6YPAYISIKA IESPNKEOOT QWLTYWWYG VFSIAEFFSO IFLSWFPFYY IUCCGFLLWC 120 

KAPSPSHGAE LLYKRIIRPF FUCHESOHDS YYKDLKDKAIC ETADAITKEA KKATVNLLGE 180 

EKXST 185

APC AMINO ACID SEQUENCE

KAAASYDQLL KQVEALKMEN SNLRQELEON SNHLTKLETE ASNMKEVLKQ LQGSZEDEAM 60 
ASSGQIOLLE RUCELNLDSS NFPGVKLRSK MSLRSYGSRE 6SVSSRSGEC SPVPMGSFPR 120 
RGFVNGSRES TGYLEELEXE RSLLLAOLOK EEKEKDWYYA QLQNLTKRIO SLLTENFSLQ 180 
TOMTRRQLEY EARQIRVAME EQLGTCQONE KRAQRRIARI QQIEKDILRI RQLLQSQATE 240 
AERSSQNKHE TGSHDAERQN EGQGVGEINM ATSGNGOGST TRMOHETASV LSSSSTHSAP 300 
RRLTSHLGTK VENVYSLLSN L6THDKD0MS RTLLAMSSSQ DSCISHRQSG CLPLLIQLLH 360 
6NDKDSVLL6 NSRGSKEARA RASAALHNII HSQPDDKRGR REIRVLHLLE QIRAYCETCW 420 
EWQEANEPGM OQDKNPMPAP VEHQZCPAVC VLMKLSFDEE HRHAMNELGG LQAIAELLQV 480 
DCEMYGLTND HYSITLRRYA 6HALTNLTFG DVANKATLCS MKGCMRALVA QLKSESEDLQ 540 
QVIASVLRNL SWRAOVNS10C TLREVGSVKA LMECALEVKK ESTUCSVLSA LWNLSAHCTE 600 
NKAOXCAVDG ALAFLVGTLT YRSQTNTLAI ZESGGGILRN VSSLIATNEO HRQILRENNC 660 
LQTLLQHUCS HSLTIVSNAC GTLWNLSARN PKDQEAUIDH 6AVSHLKNU HSKHKMIAMG 720 
SAAALRNLMA NRPAKYKOAN ZNSPGSSLPS LHVRKQKALE AELOAQKLSE TFONIONLSP 780 
KASHRSKQRH KQSLY6DYVF OTNRHOONRS ONFNTGNHTV LSPYLNTTVL PSSSSSRGSL 840 
DSSRSEKDRS LERERGIGL6 NYHPATENPG TSSJCRGLQZS TTAAQXAKVM EEVSAIHTSQ 900 
EDRSSGSfTE LHCVTDERKA LRRSSAAHTH SNTYNFTKSE NSHRTCSMPY AKLEYKRSSN 960 
OSLNSVSSSD GYGKRGQMKP SZESYSEOOE SKFCSYGQYP ADLAHKZHSA NHMOONDGEL 1020 
DTPIMYSLKY SOEQLNSGRQ SPSQNERMAR PKHIZEOEIK QSEQRQSRHQ STTYPVYTES 1080 
TOOKHUCFQP HFGQQECVSP YRSRGAHGSE TNRV6SNH6I NQNVSQSLCQ EDOYEOOKPT 1140 
NYSERYSEEE QHEEEERPTN YSZKYNEEKR HVDQPZOYSL KYATDZPSSQ KQSFSFSKSS 1200 
SGQSSKTEHM SSSSEMTSTP SSNAKRQNQL HPSSAQSRSG QPQJCAATOCV SSZNQETZQT 1260 
YCVEOTPZCF SRCSSLSSLS SAEOEZGCNQ TTQOPOSANT LQZAEZKEKZ GTRSAEOPVS 1320 
EVPAVSQHPR TKSSRLQGSS LSSESARMCA VEFSSGAKSP SKSGAQTPKS PPEHYVQETP 1380 
LNFSRCTSVS SLOSFESRSZ ASSVQSEPCS GNVSGZZSPS OLPOSPGQTN PPSRSKTPPP 1440 
PPQTAQTKRE VPKNKAPTAE KRESGPKQAA VKAAVQRVQV LPDAOTLLHF ATESTPOGFS 1500 
CSSSLSALSL DEPFZQKDVE LRZKPPVQEN ONGNETESEQ PKESNENQEK EAEKTZDSEK 1560 
DLLOOSOOOO XEZLEECZZS AMPTKSSRKA KKPAQTASKL PPPVARKPSQ LPVYKLLPSQ 1620 
MRLQPQKHVS FTPGOOMPRV YCYEGTPZNF STATSLSOLT IESPPHELAA 6E6VRGGAQS 1680 
GEFEKROTZP TEGRSTOEAQ G6KTSSVTXP ELODNKAEEG OZUECZNSA MPKGKSHKPF 1740 
RVKKXMDQVQ QASASSSAPN KNQLDGKKXK PTSPVKPZPO NTEYRTRVRK NAOSKNNLNA 1800 
ERVFSDNKDS KHQNLKNNSK DFNDKLPNNE DRVRGSFAFD SPHHYTPZEG TPYCFSRNOS 1860 
LSSLOFDODO VDLSREKAEL RKAKENKESE AKVTSHTELT SMQQSAMCTQ AZAKQPZNRG 1920 
QPKPZLQJCQS TFPQSSXDZP ORGAATOEKL QKFAZENTPV CFSKNSSLSS LSOZDQENNN 1980 
KENEPZKETE PPOSQGEPSK PQASGYAPKS FHVEDTPVCF SRNSSLSSLS XOSEOOLLQE 2040 
CZSSAMPnaC KPSRUCGONE KHSPRNNGGZ LGEOLTLOLK OZQRPOSEHG LSPOSENFDW 2100 
KAIQEGANSZ VSSLHQAAAA ACLSRQASSO SOSZLSLKSG ISL6SPFHLT POQEEKPFTS 2160 
NK6PRIUCPG EKSTLETKKZ ESESK6ZJC6G KXVYKSLZTG KVRSNSEISG QMKQPLQANM 2220 
PSZSRGRTMZ HIPGVRMSSS STSPVSXK6P PLKTPASKSP SEGQTATTSP RGAKPSVKSE 2280 
LSPVARQTSQ ZGGSSKAPSR SGSROSTPSR PAQQPLSRPZ QSPGRNSZSP GRNGZSPPNK 2340 
LSQLPRTSSP STASTKSSGS GXMSYTSPGR QMSQQNLTKQ TGLSKNASSZ PRSESASKGL 2400 
NQMNNGNGAN KKVELSRNSS TKSSGSESOR SERPVLVRQS TFZKEAPSPT LRRKLEESAS 2460 
FESLSPSSRP ASPTRSOAQT PVLSPSLPON SLSTHSSVQA GGURKLPPNL SPTZEYNDGR 2520 
PAKRHOZARS HSESPSRLPI NRSGTWKREH SKHSSSLPRV STURRTGSSS SZLSASSESS 2580 
EKAKSEDEKH VNSXSGTKQS KENQVSAK6T WRKZKENEFS PTNSTSQTVS SGATNGAESK 2640 
TUYQMAPAY SKTEOVWVRZ EDCPZNNPRS GRSPTGNTPP VZOSVSEXAN PNIKDSKDNQ 2700 
AKQNVGNGSV PMRTVGLENR LNSFZQVDAP OQKGTEZKPG QNNPVPVSET NESSIVERTP 2760 
FSSSSSSKHS SPSGTVAARV TPFMYMPSPR KSSAOSTSAR PSQZPTPVNN NTKKRDSKTD 2820 
STESSGTQSP KRHSGSYLVT SV 2842

APC 203 LGTCQOHEKRAQRRIARIQQIEKDILRIRQL 233
ral2 576 LTGAKGLQLRAUUtlARIEQGGTAISPTSPL 606



APC 453 MKLSFDEEHRHAMNELGGLQAIAELLQVD 481
m3 mAChR 249 LYWRZYKETobtTKELAGLQASGTEAETE 277

MCC 220 LYPNLAEERSRUEKELA6LREENESLTAM 248
APC 453 MKLSFDEEHRHAMNELGGLQAIAELLQVD 481

FIGURE 4 



WO 92/13103 



5/11 



PCT/US92/00376




FIGURE 5 




FIGURE 7 




FIGURE 8 



INTERNATIONAL SEARCH REPORT

International Application No: PCT/US 92/00376

I. CLASSIFICATION OF SUBJECT MATTER

According to International Patent Classification (IPC) or to both National Classification and IPC

Int. Cl. 5 C12Q1/68; C12N15/12

II. FIELDS SEARCHED

Classification System: Int. Cl. 5

Classification Symbols: C12Q; C07K; G01N

III. DOCUMENTS CONSIDERED TO BE RELEVANT

WO, A, 8 901 481 (IMPERIAL CANCER RESEARCH TECHNOLOGY) 23 February 1989
see page 2, line 15 - page 6, line 12; claims

BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS.
vol. 174, no. 1, 15 January 1991, DULUTH, MINNESOTA US
pages 298 - 304;
Y. HOSHINO ET AL.: 'Normal human chromosome 5, on which a familial
adenomatous polyposis gene is located, has tumor suppressive activity'
see abstract
see page 302, line 1 - page 303, line 9

WO, A, 9 005 180 (THE REGENTS OF THE UNIVERSITY OF CALIFORNIA) 17 May 1990
see page 7, line 25 - page 9, line 28; claims

Relevant to Claim No.: 26; 26-28

IV. CERTIFICATION

01 JUNE 1992

09 JUHflgff

EUROPEAN PATENT OFFICE

LUZZATTO E.R.



III. DOCUMENTS CONSIDERED TO BE RELEVANT (CONTINUED FROM THE SECOND SHEET)

SCIENCE.
vol. 253, 9 August 1991, LANCASTER, PA US
pages 661 - 665;
K.W. KINZLER ET AL.: 'Identification of FAP locus
genes from chromosome 5q21'
see the whole document

SCIENCE.
vol. 253, 9 August 1991, LANCASTER, PA US
pages 665 - 669;
I. NISHISHO ET AL.: 'Mutations of chromosome 5q21
genes in FAP and colorectal cancer patients'
see the whole document

SCIENCE.
vol. 253, 9 August 1991, LANCASTER, PA US
page 616;
J. MARX: 'Gene identified for inherited cancer
susceptibility'

Relevant to Claim No.: 1, 29, 32, 64-66; 1, 37



ANNEX TO THE INTERNATIONAL SEARCH REPORT
ON INTERNATIONAL PATENT APPLICATION NO. 9200376

SA 57001

Patent document cited    Publication    Patent family     Publication
in search report         date           member(s)         date

WO-A-8901481             23-02-89       EP-A- 0376968     11-07-90
                                        JP-T- 3503838     29-08-91
                                        US-A- 5098823     24-03-92

WO-A-9005180             17-05-90       AU-A- 4635089     28-05-90
                                        CA-A- 2001815     30-04-90
                                        EP-A- 0440744     14-08-91
                                        JP-T- 3505675     12-12-91